An equitable path to 
decarbonization 


Madrid climate summit will remain 
deadlocked unless developed countries 
accept responsibility for past emissions. 


“There is no sign of greenhouse-gas emis- 
sions peaking in the next few years.” 

In an ideal world, such a stark warning — 
issued by the United Nations Environment 
Programme (UNEP) — would be enough to 
persuade delegates attending this week’s climate talks in 
Madrid to take stronger action against the dangers of cli- 
mate change. But the two-week meeting is unlikely to yield 
such results. Negotiators representing the world’s govern- 
ments are more likely to postpone the hard decisions until 
next year’s talks in Glasgow, UK, when nations are scheduled 
to improve on the emissions-reduction pledges they set as 
part of the 2015 Paris climate agreement. 

Negotiators need to solve a number of competing 
problems that date back to the earliest climate talks in the 
1990s and for which there are no straightforward solutions. 

First, there must be a step-change in efforts to reduce 
emissions and keep warming to within 2 °C of pre-industrial 
temperatures. Here, there is halting progress, although 
momentum is starting to build towards a global commit- 
ment to net-zero emissions by 2050. 

Emissions from wealthier nations seem to have 
stabilized, according to the latest UNEP report. But current 
pledges to reduce emissions are still projected to result in 
at least 3 °C of warming, and most developed countries are 
not even on track to meet those commitments. 

More drastic reductions must not, however, neglect the 
development needs of the poorest communities — those 
lacking access to sufficient food, water, health care and 
electric power. Progress here has been scant. As we reported 
in September, developed nations have failed to fulfil their 
pledges to provide funding to help poorer countries protect 
themselves. This is despite the fact that it is their past emis- 
sions that are contributing to the extreme climate effects. 
This funding would also enable poorer nations to continue 
to industrialize, but use less carbon in the process. 

In 2010, developed countries pledged US$100 billion 
annually by 2020 towards such help. Some $9.8 billion was 
pledged in October at a donors’ conference in Paris, but the 
United States, which is in the process of withdrawing from 
the Paris agreement, was notable in its absence. 

These are some of the reasons why emissions from 
developing nations show few signs of tailing off. China 
has only just caught up with developed states, and its 
per-capita emissions are now close to those of Japan and 
the European Union. Its emissions from coal are projected 
to rise further still. 

The science shows hard truths. If all countries accept 
the consensus view of scientists, as most say they do, then 
by 2030, emissions must be no more than 50% of current 
levels to keep warming to below 2°C. That would need more 
than just net-zero emissions by 2050 — and include a swifter 
end to coal-fired power and the acceleration of renewable 
energy and electric-vehicle development. Much more fund- 
ing would also be required, so that developing countries 
can both decarbonize and protect vulnerable populations. 

As campaigners — and, increasingly, younger genera- 
tions — urge their national delegates to take real action 
against climate change, they must also urge their govern- 
ments to back their pledges with cash for the poorest. The 
tension between ambition to reduce emissions and the 
demands of equity must be resolved if international climate 
talks are to reach agreement. 


Tackle sickle-cell 
economics 


Most people with the disease will not be able 
to afford the eye-watering costs of treatment. 


There was a time when Olu Akinyanju felt that no 
one was listening. 
In 1994, the physician founded Sickle Cell 
Foundation Nigeria, with a mission to provide 
support for people with sickle-cell disease — a 
hereditary blood disorder that affects 20 million individ- 
uals worldwide. The condition is most common in tropical 
regions of sub-Saharan Africa, but is also found in many 
other parts of the globe. It can cause strokes, organ failure 
and harrowing episodes of excruciating pain. Between 50% 
and 90% of children in sub-Saharan Africa and India with the 
disease will die before their fifth birthday. 

For years, Akinyanju tried and failed to get traction with 
the World Health Organization (WHO). And leading health 
policymakers in African countries also had other health and 
development priorities. 

Now the landscape is changing. As we describe in a Feature 
on page 22, sickle-cell disease is finally catching the attention 
of funders, governments and pharmaceutical companies. But 
as they work on innovative ways to tackle the disease, one 
challenge stands out: how to get treatments to those in need. 

Most patients come from communities that have long 
faced discrimination and economic hardship. They can be 
stigmatized, and discussions about the condition tend to 
be rare. That’s partly why, although scientists have known 
the disease’s root molecular cause for 70 years, research has 
produced few new treatments. 

But in the past decade, more support groups have started 
to spring up in Nigeria. Internationally, organizations rang- 
ing from the WHO to the American Society of Hematology 
have made treatments for sickle-cell disease a priority. 
Newborn-screening programmes have been expanding, 
and efforts are being made to deploy an old chemotherapy 
drug called hydroxyurea in Africa to help ease symptoms. 

Nature | Vol 576 | 5 December 2019 

Last week, the US Food and Drug Administration (FDA) 
approved the first drug, voxelotor, to target the cause of the 
disease. Made by Global Blood Therapeutics in South San 
Francisco, California, it reduces the interactions between 
mutated haemoglobin proteins that lead to the sickled 
blood cells characteristic of the condition. That came hot on 
the heels of the FDA approving a drug called crizanlizumab, 
made by Novartis in Basel, Switzerland, which helps to stop 
the sickled cells from sticking together. 

In October, the US National Institutes of Health (NIH) and 
the Bill & Melinda Gates Foundation in Seattle, Washington, 
announced a landmark programme to develop gene-based 
technologies to treat sickle-cell disease and HIV in Africa. 
Both will contribute US$100 million over the next 4 years, 
and the ambition is to fund treatments into clinical trials 
within 10 years. 

These developments are promising, but they don’t 
address one stark reality. Most people with the disease strug- 
gle to access even basic health care, and the new treatments 
have a hefty price tag. 

In 2017, the FDA approved a treatment called Endari, made 
by Emmaus Medical in Torrance, California. Endari is a for- 
mulation of the amino acid glutamine, and costs $13,000 
a year. Unsurprisingly, US physicians are struggling to get 
insurance companies to foot the bill — meaning that many 
people are unable to access the treatment. 

The first gene therapies for the disease, which involve 
an elaborate procedure much like a stem-cell transplant 
(see page 18), are likely to cost upwards of $1 million per 
patient. And transplant procedures and hospital stays will 
push costs higher. The excitement even of voxelotor’s land- 
mark approval needs to be tempered by the fact that the 
treatment costs $125,000 per year per patient. 

This means that advocates such as Akinyanju cannot yet 
slow down. They have made impressive gains. But alongside 
the growing sums being invested in research and develop- 
ment, foundations, advocates and patients will continue 
to need support — especially for the costs of treatments. 

Researchers can help — not only through their work, but 
also by continuing to pressure the government officials, 
donors and health-care providers with whom they interact 
to consider the issue of who will foot the bill. 

The payment question isn’t confined to sickle-cell dis- 
ease. It bedevils many of the bespoke drugs emerging from 
biomedical research. What is clear is that the current health- 
care models won't work: insurance companies baulk at the 
costs, and public systems often can’t afford them. An answer 
will require the combined efforts of biomedical scientists, 
health-care economists, public-health experts and others. 

The NIH and the Gates foundation want a future in which 
the disease can be treated with a one-time therapy in an 
outpatient setting — and that is potentially achievable. But 
companies, funders and governments must find ways to 
ensure that the costs are not shouldered by communities 
that have already suffered for too long. 


Laying the ghost 
of Icarus 


Humanity is finally getting up close and 
personal with Earth’s nearest star. 


In some ways, NASA's Parker Solar Probe can trace 
its ancestry to the tale of Icarus, the character from 
ancient Greek mythology who took flight by donning 
wings made from feathers and wax. Ignoring advice 
from his wise father, Daedalus, Icarus flew too close to 
the Sun, causing the wax to melt, and plunged to his death. 

In the spirit of the Icarus legend, the Parker Solar Probe 
is one of the most daring space missions ever launched, 
but there’s no metaphorical melting wax. The probe’s 
cutting-edge scientific instruments live behind a car- 
bon-composite heat shield 11 centimetres thick that can 
withstand temperatures of almost 1,400 °C. 

The mission’s achievements are thanks in no small meas- 
ure to the work of teams at the Johns Hopkins University 
Applied Physics Laboratory in Laurel, Maryland, who built 
the $1.5-billion probe and designed its trajectory. 

The probe was originally supposed to start its journey 
by flying past Jupiter — the idea being that Jupiter’s gravita- 
tional influence would hurl it out of the plane of the planets 
and over the Sun’s poles, from where it would record its 
measurements. But Yanping Guo, a celestial navigator at 
the Maryland lab, found a way to send it past Venus instead. 
This, she reasoned, would keep the probe on a path in the 
planetary plane and would mean the spacecraft could visit 
the Sun more often and spend more time close to the star. 
Since its 2018 launch, the probe has passed close to the 
Sun 3 times — and it will do so another 21 times in the 
next 6 years, sending back exclusive data from the Solar 
System’s hottest and most dangerous object. 

This week, a News & Views article (D. Verscharen Nature 
https://doi.org/10.1038/d41586-019-03665-3; 2019) dis- 
cusses four papers, published in Nature, that report the 
first of the probe’s discoveries, resolving mysteries such 
as the birthplace of the energetic particles that make up 
the solar wind, which floods interplanetary space. 

Astrophysicist Eugene Parker at the University of 
Chicago in Illinois proposed the existence of the solar wind 
more than 60 years ago (E. N. Parker Phys. Fluids 1, 171-187; 
1958). At that time, few of his peers accepted that he was on 
to something. Now, at the age of 92, Parker can justifiably 
revel in the data from the spacecraft named after him. 

The Parker Solar Probe has many more solar flybys ahead 
of it, taking it progressively closer to the star. The space- 
craft has yet to cross a long-anticipated boundary into the 
Sun’s corona, or outer atmosphere; beyond that lies a ‘here 
be dragons’ realm that no one has ever seen. 

The ghost of Icarus has finally been laid to rest. Much 
more science is sure to come. 


A personal take on science and society 


World view 


By Anna Hatch 


To fix research assessment, 
swap slogans for definitions 


Evaluation reforms will go round in circles 
without conceptual clarity, warns Anna Hatch. 


The need for clarity extends beyond how we 
communicate science to how we evaluate it. 
Who can really define stock phrases such as ‘a 
significant contribution to research’? Or under- 
stand what ‘high impact’ or ‘world-class’ mean? 

Seven years ago this month, scientists met in San Fran- 
cisco, California, to call for an end to the practice of assess- 
ing research through the impact factors of the journals in 
which it is published. They demanded that institutions 
instead be explicit about their criteria and consider all 
scholarly outputs — preprints, code, data, peer review, 
teaching, mentoring and so on. Today, thousands have 
signed the resulting Declaration on Research Assessment 
(DORA). But actual change is all too slow. 

Two years ago, the DORA steering committee hired me to 
survey practices in research assessment and promote the 
best ones. Other efforts have similar goals. These include 
the Leiden Manifesto and the HuMetricsHSS Initiative. 

My view is that most assessment guidelines permit 
sliding standards: instead of clearly defined terms, they give 
us feel-good slogans that lack any fixed meaning. Facing 
the problem will get us much of the way towards a solution. 

Broad language increases room for misinterpretation. 
‘High impact’ can be code for where research is published. 
Or it can mean the effect that research has had on its field, or 
on society locally or globally — often very different things. 
Yet confusion is the least of the problems. Descriptors such 
as ‘world-class’ and ‘excellent’ allow assessors to vary com- 
parisons depending on whose work they are assessing. 
Academia cannot be a meritocracy if standards change 
depending on whom we are evaluating. Unconscious bias 
associated with factors such as a researcher’s gender, eth- 
nic origin and social background helps to perpetuate the 
status quo. It was only with double-blind review of research 
proposals that women finally got fair access to the Hubble 
Space Telescope. Research suggests that using words such 
as ‘excellence’ in the criteria for grants, awards and promo- 
tion can contribute to hypercompetition, in part through 
the ‘Matthew effect’, in which recognition and resources 
flow mainly to those who have already received them. 

Many strategies exist to improve equity in academia, but 
conceptual clarity is paramount. A study probing the use 
of ‘outcome’ and ‘impact’ in international-development 
work concluded that such terms undermine evaluation 
efforts. It proposed a combination of strategies including 
the use of meaningful qualifiers, such as the type of result 
and how it relates to a project’s purpose, and the creation of 
mutually exclusive definitions for terms such as ‘outcome’, 
‘impact’ and ‘output’ (B. Belcher & M. Palenberg Am. J. Eval. 
39, 478-495; 2018). 

Some people say that excellence is easy to identify 
because ‘you know it when you see it’. But Nobel prizes have 
been awarded for research that was not immediately recog- 
nized as a major breakthrough. And it becomes practically 
impossible to distinguish shades of excellence when many 
qualified applicants compete for limited funds. 

Being explicit about how specific qualities are valued 
leads assessors to think critically about whether those qual- 
ities are truly being considered. Achieving that conceptual 
clarity requires discussion with faculty members, staff and 
students: hours and hours of it. The University Medical 
Center Utrecht in the Netherlands, for example, held a 
series of conversations, each involving 20-60 research- 
ers, and then spent another year revising its research 
assessment policies to recognize societal impacts. 

Although DORA curates examples of good practice (see 
go.nature.com/2qkcssw), most of the best efforts cannot 
(yet) be found in databases or publications. Often the only 
way to learn about them is through discussion and network- 
ing. It was not until DORA held a meeting with the Howard 
Hughes Medical Institute in Chevy Chase, Maryland, in 
October that I learnt the University of California, Irvine, had 
moved to include collaborative scholarship in evaluations. 
It took an e-mail exchange to learn of an administrator’s per- 
sonal efforts to find tools that explicitly credit collaboration. 

Frank conversations about what is valued in a particular 
context, or at a specific institution, are an essential first 
step in developing concrete recommendations. Although 
ambiguous terms, for instance ‘world-class’ and ‘signif- 
icant’, are a hindrance when performing assessments, 
university administrators have also told me that they rely 
on flexible language to make room to reward a variety of 
contributions. So it makes sense that more specific lan- 
guage in review, promotion and tenure guidelines must 
be able to accommodate varied outputs, outcomes and 
impacts of scholarly work. 

The joint meeting of the American Society for Cell 
Biology and the European Molecular Biology Organization 
in Washington DC this month will include a mock faculty-re- 
cruitment exercise, involving approaches such as removing 
applicant names and journal titles from bibliographies. 
Participants will then discuss which standards to apply to 
improve objectivity, and how to apply them. 

Setting such standards will be tough. It will be tempting 
to fall back on the misleading simplicity of metrics such as 
impact factors, or on ambiguous terms that can be agreed 
to by everyone but applied judiciously by no one. It is too 
early to know what those standards will be or how much 
they will vary, but the right discussions are starting to 
happen. They must continue. 


The world this week 


News in brief 


MALARIA CASES DECREASE 
WORLDWIDE 


The number of malaria 
infections recorded globally 
has fallen for the first time in 
several years, according to the 
World Health Organization 
(WHO), which published its 
annual World Malaria Report on 
4 December. 

Rising numbers of cases in 
2016 and 2017 sparked fears 
that progress had stalled in 
the global fight against the 
mosquito-borne disease. But 
the WHO estimates that there 
were 228 million reported cases 
in 2018, a decrease of around 
3 million from the previous year. 

This drop can be attributed 
in large part to fewer cases in 
southeast Asia (see ‘Malaria 
in southeast Asia’). The WHO 
found that, in the past decade, 
the most marked decline 
has been in six countries 
across the Mekong River 
basin — Cambodia, China, 
Laos, Myanmar, Thailand and 
Vietnam. 

From 2010 to 2018, malaria 
cases dropped by 76% in these 
countries, and malaria-related 
deaths fell by 95%. In 2018, 
Cambodia reported zero 
malaria-related deaths for 
the first time in the country’s 
history. India also reported a 
huge reduction in infections, 
with 2.6 million fewer cases in 
2018 than in 2017. 

Data on malaria can be 
inaccurate in countries with 
poor surveillance systems, warns 
Arjen Dondorp, deputy director 
of the Mahidol Oxford Tropical 
Medicine Research Unit in 
Bangkok. And even if the number 
of officially reported deaths is 
zero, he adds, this doesn’t mean 
that there are no malaria-related 
casualties. However, “malaria 
cases are definitely going down” 
in countries such as Cambodia, 
he says. 

Progress has stalled and 
even reversed in other parts of 
the world. Africa, for example, 
reported an increase of 1 million 
cases from 2017 to 2018, and the 
continent accounted for 94% of 
global cases and deaths from the 
disease in 2018. 

Pedro Alonso, director of the 
WHO Global Malaria Programme 
in Geneva, Switzerland, says 
that, despite the global drop 
in 2018, malaria cases have 
stabilized at “unacceptably high 
numbers” over the past few 
years. “But this is not a helpless 
situation,” he says, noting that 
improved efforts to prevent, 
detect and treat the disease are 
allowing several countries to 
successfully eliminate malaria. 


MALARIA IN SOUTHEAST ASIA 


The prevalence of malaria has fallen in southeast Asia. Last year, this contributed to 
an overall drop in cases globally, despite increases in Africa. 


EUROPEAN SPACE 
BUDGET GETS 
MASSIVE BOOST 


The European Space Agency 
(ESA) has secured a 45% budget 
boost. At a meeting in Seville, 
Spain, on 27-28 November, 
ministers pledged €12.5 billion 
(US$13.8 billion) for 2020-22, 
compared with the €8.6 billion 
approved at their 2016 meeting. 
ESA’s basic-science projects 
got a 10% hike, the biggest in 
25 years. That will allow the 
agency to bring forward its 
space-based gravitational- 
wave mission, the Laser 
Interferometer Space Antenna 
(LISA), by two years, from 
2034 to 2032, allowing it to 
observe astrophysical events in 
tandem with ESA’s Athena X-ray 
telescope, set to launch in 2031. 
As part of a new €432-million 
‘space safety’ budget stream, 
European nations also backed a 
science and planetary-defence 
mission. For human and robotic 
exploration, they earmarked 
nearly €2 billion, with around 
€300 million to build modules 
for NASA’s Moon-orbiting 
Gateway, as well as €150 million 
for robotic lunar missions. 
Meanwhile, Europe’s flagship 
Earth-observation programme, 
Copernicus (pictured), received 
€400 million more than the 
agency had asked for. Other 
projects that can now press 
ahead include the design of 
Europe’s first quantum satellite, 
SAGA, and a project designed 
to demonstrate ways to remove 
space debris from orbit. 


Armed groups have killed four Ebola responders in the 
eastern Democratic Republic of the Congo (DRC) and 
injured seven others in a series of attacks that began late 
on 27 November, according to the World Health Organi- 
zation (WHO). 

The dead include a vaccination worker, two drivers 
and a police officer, the agency said. Dozens of aid 
workers have been evacuated from the areas under 
siege, and the Ebola response there has mostly halted. 

The attacks, in Biakato and Mangina, came after vio- 
lence in nearby Beni (pictured) prompted the WHO and 
aid groups to begin evacuating workers from that city 
last week. “We are heartbroken that people have died in 
the line of duty,” WHO director-general Tedros Adhanom 
Ghebreyesus tweeted on 28 November. 

One late-night attack targeted the residence of Ebola 
responders in Biakato. The same night — 27 November — 
armed groups charged an Ebola-response coordination 
centre in Mangina. 

The violence is poised to drive up the number of new 
Ebola cases, the WHO said last week. Ebola has killed 
roughly 2,200 people in the DRC since August 2018. 


CHEMICAL-WEAPONS 
TREATY BANS 
NOVICHOKS 


The group of nerve agents 
known as Novichoks is to be 
added to the Chemical Weapons 
Convention’s list of controlled 
substances, in one of the first 
major changes to the treaty 
since it was agreed in the 1990s. 

The compounds, developed 
by the Soviet Union during the 
cold war, came to prominence 
after they were used in a high- 
profile assassination attempt 
on a former Russian military 
officer, Sergei Skripal, in 
Salisbury, UK, in March last year. 

The Organisation for the 
Prohibition of Chemical 
Weapons, which is tasked with 
enforcing the treaty, announced 
the decision to explicitly ban 
Novichoks on 27 November as 
representatives from the 193 
member states met in The Hague 
in the Netherlands for a periodic 
review of the convention. The 
update will come into effect in 
180 days. 

Novichoks (along with any 
other nerve agents or deadly 
chemicals) were already 
implicitly covered by the 
convention, which bans the use 
of any chemical as a weapon. 

But the specific mention of 
these compounds in the treaty 
— and information about their 
chemical structures — should 
help to raise global awareness of 
the ban among chemists. 


CHINESE 
UNIVERSITIES 
CLASSED AS ‘RISKY’ 
COLLABORATORS 


Forty-three Chinese universities 
are considered ‘very high risk’ or 
‘high risk’ collaborators because 
of their involvement in research 
for military and defence 
purposes, according to an 
Australian think tank. A report 
published on 25 November by 
the Australian Strategic Policy 
Institute in Canberra details how 
China is using its universities to 
boost its military prowess. 

The institute also launched 
a database, partly funded by 
the US State Department, that 
classifies the level of risk posed 
by research partnerships with 
some 160 Chinese universities, 
security institutions and 
defence-industry groups. 
Chinese institutions were 
included on the basis of their 
links to defence agencies 
and the Chinese People’s 
Liberation Army (PLA) — such 
as holding security credentials 
for participating in classified 
defence or weapons-technology 
projects, agreements with the 
PLA or other defence-industry 
agencies, or records of the 
institution’s involvement in 
surveillance. 

The analysis comes just 
weeks after the Australian 
government released guidelines 
to help universities reduce the 
threat of foreign entities, such 
as the government of China, 
attempting to leverage activities 
on campus that are against 
Australia’s interests. 


News in focus 


Charged particles flow around and away from the Sun. Those that reach Earth can disrupt radio communications. 


SUN-BOMBING CRAFT 
UNCOVERS SECRETS 
OF THE SOLAR WIND 


Surprise magnetic reversals and a fast rotating wind 
mark the first findings from NASA’s Parker Solar Probe. 


By Alexandra Witze 


A spacecraft buzzing past the Sun has 
caught the best-ever glimpse of the 
birthplace of the solar wind — the 
stream of energized particles that 
floods outwards from the star. 
NASA’s Parker Solar Probe spotted strange 
spikes in the wind, where particles speed 
up and flip the direction of the wind’s mag- 
netic field. The spacecraft also observed the 
wind rotating around the Sun faster than 
expected — suggesting that scientists’ under- 
standing of how stars’ rotation slows down 
as they age could be wrong. 

The findings, described in four papers 
published on 4 December in Nature, could 
help researchers to better prepare for periods 
when the solar wind is particularly turbulent 
and knocks out radio and other communica- 
tions as it washes over Earth. They are the first 
discoveries from the Parker Solar Probe, which 
launched in 2018 and has made three circuits 
around the Sun so far. 

“We're seeing terrific new plasma astrophys- 
ics in action, right from the beginning,” says 
Stuart Bale, a plasma physicist at the University 
of California, Berkeley. “It’s been spectacular.” 

The Parker Solar Probe is gradually drawing 
closer to the Sun as it loops around the star. 
The most recent encounter was in September, 
and the next is expected in January. “We’re 
observing in a regime that we’ve only specu- 
lated about before now,” says Sarah Gibson, 
a solar physicist at the National Center for 
Atmospheric Research in Boulder, Colorado. 
The probe is studying the energy that heats 
the Sun’s outer atmosphere, or corona, and 
accelerates the solar wind. 

Although scientists can study the solar wind 
as it flows over Earth, doing so is like trying to 
study the origin of a waterfall from halfway 
down the cliff over which it pours, says Bale. 
“If you want to know the source, you have 
to get up there and get closer — is it coming 
from one hole in the ground? From a bunch of 
seams in the rocks? Is there a sprinkler system 
up there?” 

The Parker Solar Probe measured a portion 
of the solar wind coming from a small hole in 
the Sun’s corona near the equator¹. It is the 
closest look yet at one of the solar wind’s 
points of origin. 

The spacecraft also found that, as the wind 
streams out into space, parts of it race ahead in 
high-velocity spikes. “I think of them as rogue 
waves,” says Justin Kasper, a space scientist at 
the University of Michigan in Ann Arbor. Within 
these waves, the speed of the solar wind dou- 
bled, and the strong flow temporarily reversed 
the wind’s magnetic field. The probe flew 
through more than 1,000 of these spikes each 
time it zipped past the Sun, Kasper says. Scien- 
tists don’t yet understand what causes them. 

Another surprising finding is how quickly 
the solar wind rotates around the Sun as the 
star spins. Models suggest that the wind flows 
in this direction at a speed of a few kilometres 
per second — but the Parker Solar Probe meas- 
ured it moving at around 35 to 50 kilometres 
a second. 

The discovery has major implications. 
Knowing that the wind is rotating at a different 
speed from expected could help researchers 
to improve predictions of when a dangerous 
solar outburst might reach Earth. The find- 
ing also suggests that the solar wind is trans- 
porting more energy away from the Sun than 
previously thought, so the star’s rotation might 
be slowing more rapidly than expected. If so, 
astronomers might need to revise their ideas 
about how other stars in the Universe age. 

So far, the Parker Solar Probe has studied 
only a small portion of the Sun at close range. 
More observations are needed to confirm the 
unexpectedly fast rotation speed of the solar 
wind, says Adam Finley, an astronomer at the 
University of Exeter, UK. 

There’s plenty more time for discovery. By 
the end of its mission in 2025, the probe will 
have had 24 close encounters with the Sun — 
getting more than three times closer to the 
star than it has so far. 


1. Bale, S. D. et al. Nature https://doi.org/10.1038/s41586- 
019-1818-7 (2019). 


CHINESE MINISTRY 
INVESTIGATES IMAGES IN 
TOP ACADEMIC’S PAPERS 


Four journals also say they are examining articles 
co-authored by university president Cao Xuetao. 


Cao Xuetao has been a prominent voice for strengthening research integrity in China. 


By Andrew Silver 


The Chinese education ministry is 
investigating scientific articles 
authored by high-profile immunol- 
ogist and university president Cao 
Xuetao, following suggestions that 
dozens of papers contain potentially prob- 
lematic images. Four journals also say they 
are examining papers from Cao. 

The scrutiny comes after US microbiologist 
Elisabeth Bik raised concerns three weeks 
ago on Twitter and the post-publication 
peer-discussion site PubPeer about images 
in papers written by Cao and his group. 

Cao is the president of Nankai University 
in Tianjin, and his team has pioneered the 
development of cancer immunotherapies in 
China. He says that his group is investigating 
the papers in question, and he is confident that 
the issues raised do not affect the papers’ con- 
clusions. Cao has been a prominent voice for 
strengthening research integrity in China, and 
gave a speech on the topic at the prestigious 
Great Hall of the People in Beijing in November. 

Bik has flagged up potential problems in 
about 50 papers co-authored by Cao on Pub- 
Peer, and other users, most of them anony- 
mous, have raised similar issues concerning 
another handful of papers from the group. 
As Nature went to press, images in 63 papers 
that the team has published in 28 journals since 
2003 have been flagged on the site. 

In some papers, Bik says, seemingly identical 
images are labelled as representing different 
biomedical experiments. In others, features 
such as patterns of dots that represent biologi- 
cal data seem to be “unexpectedly” duplicated 
in the same image, she says. 

“That would be the equivalent of someone 
showing you a photo of the night sky, and you 
would see two Big Dipper constellations in the 
same photo,” says Bik, who has developed a
reputation for spotting and raising potential 
problems in scientific images and figures. 

During a press conference on 22 November, 
Xu Mei, a spokesperson from China’s Ministry
of Education, said the ministry is investigat- 
ing the articles in question and the “relevant” 
institutions. Cao is also director of the Institute
of Immunology at the Second Military Medi- 
cal University in Shanghai, also known as the 
National Key Laboratory of Medical Immunol- 
ogy. Most of the 63 articles list this affiliation. 

Representatives from 4 of the 28 journals
concerned (Science, Nature Communications,
Cardiovascular Research and Molecular
Immunology) told Nature that they had heard
about the potentially problematic papers in
their journals and were reviewing them.

Bik told Nature that she cannot comment on 
whether the issues she’s flagged up are the result 
of research misconduct. “It is up to the affiliated
institutions to investigate and conclude,” she
says. Although Cao’s name is on the papers,
often as the corresponding author, it is not clear
how closely he was involved in the work. 

On 17 November, Cao responded on PubPeer
to Bik’s comments, saying that his team and 
collaborators have made it their priority to 
re-examine the identified manuscripts, raw 
data and lab records. “We'll work with the rel- 
evant journal editorial office(s) immediately 
if our investigation indicates any risk to the 
highest degree of accuracy of the published 
records,” he wrote. 

He also said he is confident that the
conclusions in those papers remain valid and the
work reproducible. He apologized for “any
oversight on my part” in his role as a mentor,
supervisor and lab leader, and added that there is no
excuse for a lapse in his supervision or lead- 
ership. “I'll use this as an invaluable learning 
opportunity to do better not only in advancing 
science, but also in safeguarding the accuracy 
and integrity of science,” he wrote. 

Cao did not respond to requests for com- 
ment on the issues raised about his team’s 
papers on PubPeer. Nankai University directed 
Nature to Cao’s statement on PubPeer. 

Individuals, including some who seem 
to be Cao’s co-authors, have responded on 
PubPeer to some of Bik’s queries. In at least 
one case, a co-author acknowledges that the 
wrong photograph has been published. In 
another case, commentators suggest that 
images flagged as duplicates by Bik were, in 
fact, pictures of the same cells taken over time, 
but that the figure’s labels were unclear. The 
explanations given in those cases have been 
satisfactory, says Bik. 

In comments about a few other papers, 
Bik questions images that the authors have 
already acknowledged in published errata. 

But the authors have not yet responded to 
questions raised about other papers, in which 
features such as bars or patterns of dots occur
multiple times in the same image, she says. 

Several researchers who have not collab- 
orated with Cao or Bik have told Nature that 
the figures she has flagged up seem suspi- 
cious. Nicole La Gruta, a molecular biologist 
at Monash University in Melbourne, Australia, 
says that, in her opinion: “It is clear from the 
multiple images that I have seen that these are 
definitely manipulated.” 

Wouter Masselink, a postdoctoral molecular 
biologist at the Vienna BioCenter in Austria, 
agrees that some of the images require
explanation. “I hope the institutions and
universities that Cao is associated with launch a formal
and independent investigation to find out how 
and where these artefacts ended up in the pub- 
lished manuscripts,” he says. 

Bik says she plans to contact the journals 
that published the papers she has identified. 
But the comments on Twitter and PubPeer 
have already caught the attention of some 
journals. Meagan Phelan, a spokesperson for
Science’s publisher, the American Association
for the Advancement of Science in Washington 
DC, says Science is reviewing an article in the 
journal that Bik flagged up. She added that it’s 
up to institutions to investigate any possible 
misconduct, which would inform any deci- 
sions the journal made. 

Elisa De Ranieri, the editor-in-chief of Nature 
Communications in London, says the journal 
saw posts on Twitter and PubPeer that raised 
issues over potential image manipulation
and will examine any relevant papers as part 
of their usual research-integrity processes. 

Cao received a Nature Award in 2015
for excellence in mentoring, and he is
co-editor-in-chief of Cellular & Molecular
Immunology, a journal published by Springer
Nature, which also publishes Nature (Nature’s
news and comment team is editorially
independent of its publisher, and of other
Nature-branded journals). A spokesperson for the
company says it does not appoint the journal’s
editorial committee. They said the company is
aware that concerns have been raised around
some Cao papers but has no further comment.

On 22 November, Nature Immunology 
posted an ‘Editor’s Note’ on two of Cao’s 
papers. One says the authors had flagged up a
duplicated image before publication but it was
not corrected in time; in the other, the journal
says a duplicated image was “inadvertently 
introduced during the production process”. 


UN CLIMATE SUMMIT
SET TO TACKLE
CARBON MARKETS


Negotiations take place amid uncertain geopolitics 
and intensifying public pressure. 


By Quirin Schiermeier 


Four years after pledging to limit global
warming to no more than 2 °C above
pre-industrial levels, representatives
of nearly 200 countries are meeting to
put the finishing touches to the 2015
Paris climate accord.

Discussions at the annual United Nations’ 
climate conference, COP25, are expected to 
focus on international carbon markets, which 
have the potential to reduce the overall cost of
global climate-mitigation efforts.

Protesters gather in London as part of the Global Climate Strike in November.

But the talks, which started on 2 December 
in Madrid and last until 13 December, take 
place against a backdrop of shifting geopoli- 
tics that has created uncertainty over who will 
lead global efforts to tackle climate change, 
and of intensifying public pressure on govern- 
ments to take action. 

Despite pledges to curb emissions, atmos- 
pheric greenhouse-gas concentrations 
reached a new peak in 2018, the World
Meteorological Organization said last week. A UN
climate report released on 26 November warns
that the Paris agreement’s 2°C goal might soon 
be out of reach as emissions continue to rise. 


Unfinished business 


At last year’s conference, nations agreed on a
set of rules for tracking and reporting green- 
house-gas emissions and for reviewing collec- 
tive progress. However, they failed to establish 
clear rules for carbon markets through which 
emissions made in one country can be offset 
by investing in low-carbon technologies
elsewhere. Article 6 of the Paris agreement, which
aims to promote voluntary international
cooperation between nations, is a central point on
the agenda, and offsetting will almost certainly 
be discussed. 

Voluntary offsetting schemes are already in 
use to make certain goods and services, such as
passenger flights, ‘carbon neutral’. Many coun- 
tries, including New Zealand, Sweden and the 
United Kingdom, rely on offsetting to achieve 
their emission-reduction goals. 

Critics say that offsetting schemes allow rich
countries to dodge responsibility for cutting 
their own emissions. But a well-organized 
international carbon market with clear, prac- 
tical rules could save up to US$250 billion 
in climate-mitigation costs, says Stefano De 
Clara, a policy adviser at the International 
Emissions Trading Association in Brussels. “It 
would engage businesses in climate action and 
facilitate the linkage of existing carbon pricing 
systems,” he says. “In the end, everyone could
be better off through collaboration.” 

Analysts have warned that poorly planned 
offsetting schemes could actually hinder 
efforts to curb global emissions. Under the 
Paris agreement, countries must adjust their 
emission-reduction pledges every five years, 
in line with the latest scientific evidence about
what will be required to stabilize the climate. 
Without proper rules and bookkeeping, off- 
setting could simply move emission-reduction 
efforts around the world, instead of reducing 
overall emissions, says Gilles Dufrasne, an envi- 
ronmental economist with the Brussels-based 
international climate-policy watchdog Carbon 
Market Watch. 

Jacob Werksman, a climate-policy adviser 
at the European Commission, warns that 
there are some sticking points that negoti- 
ators in Madrid might not be able to resolve. 
For example, some countries expect that 
excess carbon credits from the expiring 1997 
Kyoto Protocol, the previous international 
climate treaty, will remain eligible for use 
under the Paris agreement. Such a concession
would “severely undermine” the agreement,
Werksman says. 

This year’s talks are also facing intense public 
scrutiny. The rapidly growing climate-protest 
movement is shifting the overall conversation 
on climate change, says Valérie Masson-Delmotte,
a co-chair of the Intergovernmental
Panel on Climate Change.

Politics are shifting, too. The United States’ 
official withdrawal from the Paris agreement 
puts the nation in a strange position for this
year’s talks. It will remain a member of the UN
Framework Convention on Climate Change, 
an international treaty under which both the 
Kyoto Protocol and the Paris agreement were 
negotiated. And US representatives will still 
attend future COP meetings — including next 
year’s meeting in Glasgow, UK. But unless a 
future US government revokes the decision 
to quit the Paris agreement, the country will no 
longer participate in negotiations concerning
the rules and implementation of the accord.

There is some hope that the European Union 
will provide new leadership, says Oliver Geden, 
a policy researcher at the German Institute for 
International and Security Affairs in Berlin. On 
28 November, the European Parliament voted 
to declare a ‘climate and environmental emer- 
gency’, which will put pressure on EU member 
states to approve the European Commission’s 
plans to cut emissions by 55% by 2030, and to 
achieve net-zero emissions by 2050. 

“At this time it’s up to the EU to demonstrate 
that the Paris agreement can deliver after all,” 
says Geden. “That’s a tough nut to crack.” 


TARGETED ATTACKS COULD 
MAKE BLOOD-STEM-CELL 
TRANSPLANTS SAFER 


Such procedures show promise for genetic and 
immune disorders, but are currently risky. 


Physicians prepare to take a sample of a patient’s bone marrow.


By Heidi Ledford 


Scientists are experimenting with
ways to selectively target the body’s 
blood-making cells for destruction. 
Early studies in animals and people 
suggest that the approach could 
make blood-stem-cell transplants — power- 
ful but dangerous procedures that are used 
mainly to treat blood cancers — safer, and 
thereby broaden their use. The studies come 
as evidence piles up that such transplants 
can also be used to treat some autoimmune
disorders and genetic diseases.

The work, to be presented at the forthcoming
annual meeting of the American Society of
Hematology in Orlando, Florida, harnesses an
understanding of the proteins made by different
types of blood stem cell, the cells in the
bone marrow that produce the various cellular
components of blood.

Blood-stem-cell transplants work by replac- 
ing defective blood-making cells — which can 
give rise to blood cancer, as well as to genetic 
and autoimmune diseases — with healthy 
ones, either from donors or from the patients
themselves. The idea behind the new targeted
approaches is to eradicate specific stem cells 
to make room for transplanted cells without 
the side effects of existing treatments, which 
destroy bone marrow cells indiscriminately. 

Physicians currently rely on full-body radi- 
ation or treatment with toxic, DNA-damaging 
chemotherapy drugs to kill existing blood 
stem cells and clear the way for the trans- 
planted cells to repopulate the marrow. That 
preparation kills not only blood stem cells, but 
also a host of other cells in the marrow. This can
cause infertility, seed cancers that occur later 
in life, and severely compromise the immune 
system, leading to lengthy hospital stays. 

“It’s really prohibitive for patients,” says 
David Scadden, a stem-cell biologist at Har- 
vard University in Cambridge, Massachusetts. 
“This technology just won't be adopted unless 
we really change the whole dynamic.” 


Stem-cell hotel 


One way to think about stem-cell transplants 
is that the bone marrow is a hotel whose owner
wants to evict some guests, says Jens-Peter 
Volkmer, vice-president of research at Forty 
Seven, a biotechnology company in Menlo 
Park, California. Current treatments blow up 
the whole hotel, he says. “Then everybody’s 
dead, including all of these critical compo- 
nents that you need to protect the patient 
from infection.” The latest approaches allow 
the owner to tell specific guests to leave — by 
targeting sets of cells in the bone marrow, 
rather than killing them all, Volkmer says. 

At the haematology meeting, which begins
on 7 December, researchers from Forty Seven
will present the results of studies that tested
a combination of two antibodies in monkeys.
One antibody blocks the activity of a molecule
called c-Kit, which is found on blood stem cells
and is vital to their function; the other inhibits
a protein called CD47, which is found on some
immune cells. Inhibiting CD47 allows those 
immune cells to sweep up the stem cells tar- 
geted by the c-Kit antibody, making way for 
new cells. 

In the tests, the combination reduced the 
number of blood stem cells in bone marrow. 
But the team has not yet demonstrated that 
the treatment clears out enough old cells to 
allow transplanted cells to flourish. 

Another company, Magenta Therapeutics of 
Cambridge, Massachusetts, has collaborated 
with researchers at the US National Institutes 
of Health to test a different antibody, which 
binds to c-Kit and then releases a toxin to kill 
the blood stem cell that produced the protein. 
Data from studies in mice and one monkey sug- 
gest that this can kill off enough stem cells in 
the bone marrow for transplanted cells to 
thrive — without destroying other cells such 
as immune cells. 

And a team led by transplant physician Judith
Shizuru at Stanford University in California
has tested a similar approach in babies with a
genetic disorder that cripples the immune
system. The researchers, in a collaboration that
includes the firm Amgen of Thousand Oaks,
California, used a third antibody that targets
c-Kit. The team found that transplanted stem
cells, in this case from donors who did not have
the disease, successfully took hold in the bone
marrow of four out of six of the babies.


Expanding market 


These developments come as the potential 
market for blood-stem-cell transplants is 
expanding, says Mani Foroohar, an analyst 
at SVB Leerink investment bank in Boston, 
Massachusetts. 

Some gene therapies, such as one recently 
approved by European regulators to treat a 
genetic immune disorder called ADA-SCID, 
use a version of the technique. They remove 
the patient’s blood stem cells, then geneti- 
cally modify them so that they are free of the 
disorder before infusing them back into the 
body. Magenta and Forty Seven have entered
into separate collaborations with researchers
developing gene therapies to treat blood
disorders such as β-thalassaemia and sickle-cell
disease (see page 22).

And data are accumulating to show that 
some people with type 1 diabetes, systemic 
scleroderma and other autoimmune disor- 
ders can enter long-lasting remission if the 
mature immune cells in their bone marrow 
are wiped out and replaced with an infusion 
of their own blood stem cells (E. Snarski et al. 
Bone Marrow Transpl. 51, 398-402; 2016; 
K. M. Sullivan et al. N. Engl. J. Med. 378, 35-47; 
2018). The procedure is thought to reset the 
immune system by eradicating cells that are 
attacking the body’s own tissue, says Keith 
Sullivan, a stem-cell transplant physician at 
Duke University in Durham, North Carolina. 

Sullivan says that the early data from 
Shizuru and others are intriguing, and that 
he has begun discussions to collaborate with 
researchers in the field. “The train is moving 
now,” he says. “The question is, how do we do 
this in the right way?” 


CARBON DIOXIDE-EATING 
BACTERIA OFFER HOPE 
FOR GREEN PRODUCTION 


Lab workhorse E. coli engineered to make nutrients
from greenhouse gas rather than from sugars. 


By Ewen Callaway 


Escherichia coli is on a diet. Researchers
have created a strain of the model
bacterium, known as E. coli for short,
that grows by consuming carbon
dioxide instead of sugars or other
organic molecules.

The achievement is a milestone, say 
scientists, because it drastically alters the 
inner workings of one of biology’s most pop- 
ular model organisms. And, in the future, 
CO₂-eating E. coli could be used to make
organic carbon molecules for biofuels or to 
produce food. 

Products made in this way would have lower 
emissions than those made using conventional 
production methods, and could potentially 
remove the gas from the air. The work was 
published on 27 November (S. Gleizer et al. 
Cell 179, 1255-1263; 2019).

“It’s like a metabolic heart transplanta- 
tion,” says Tobias Erb, a biochemist and 
synthetic biologist at the Max Planck Institute 
for Terrestrial Microbiology in Marburg, 
Germany, who wasn’t involved in the study.

Plants and photosynthetic cyanobacteria
(aquatic microbes that produce oxygen) use
the energy from light to transform, or fix, CO₂
into the carbon-containing building blocks
of life, including DNA, proteins and fats. But
these organisms can be hard to genetically
modify, which has slowed efforts to turn them
into biological factories.

By contrast, E. coli is relatively easy to
engineer, and its fast growth means that changes
can be quickly tested and tweaked to optimize
genetic alterations. But the bacterium
prefers to grow on sugars such as glucose and,
instead of consuming CO₂, it emits the gas as
waste.

Ron Milo, a systems biologist at the 
Weizmann Institute of Science in Rehovot, 
Israel, and his team have spent the past
decade overhauling E. coli’s diet. In 2016, they
created a strain that consumed CO₂, but the
compound accounted for only a fraction of
the organism’s carbon intake; the rest came
from an organic compound that the bacteria
were fed, called pyruvate (N. Antonovsky et al.
Cell 166, 115-125; 2016).


Gas diet 


In the latest work, Milo and his team used a mix
of genetic engineering and laboratory
evolution to create a strain of E. coli that can get all of
its carbon from CO₂. First, they gave the
bacterium genes that encode a pair of enzymes that
allow photosynthetic organisms to convert
CO₂ into organic carbon.

Plants and cyanobacteria power this con- 
version with light, but that wasn’t feasible for 
E. coli. Instead, Milo’s team inserted a gene 
that lets the bacterium glean energy from an 
organic molecule called formate. 

Even with these additions, the bacterium 
refused to swap its sugar meals for CO₂. To
further tweak the strain, the researchers cultured
successive generations of the modified E. coli
for a year, giving them only minute quantities
of sugar, and CO₂ at concentrations about
250 times those in Earth’s atmosphere.

They hoped that the bacteria would evolve
mutations to adapt to this new diet. After
about 200 days, the first cells capable of using
CO₂ as their only carbon source emerged. And
after 300 days, these bacteria grew faster in
the lab conditions than did those that could
not consume CO₂.

The model bacterium Escherichia coli.

The CO₂-eating, or autotrophic, E. coli
strains can still grow on sugar, and would use
that source of fuel over CO₂ given the choice,
says Milo. Compared with normal E. coli, which
can double in number every 20 minutes, the
autotrophic E. coli are laggards, dividing every
18 hours when grown in an atmosphere that is
10% CO₂. They are not able to subsist without
sugar on atmospheric levels of CO₂, currently
0.041%.
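
To put those figures side by side, the short sketch below works through the arithmetic. The doubling times and CO₂ percentages are the ones quoted above; the growth model (idealized exponential growth at a constant doubling time, with nothing limiting the culture) and all of the names in the code are illustrative assumptions, not part of the study.

```python
# Figures quoted in the article; the growth model itself is an assumption.
WILD_TYPE_DOUBLING_H = 20 / 60   # normal E. coli: about 20 minutes per doubling
AUTOTROPH_DOUBLING_H = 18.0      # CO2-fed strain in 10% CO2: about 18 hours
LAB_CO2_PERCENT = 10.0           # CO2 share of the lab atmosphere
ATMOSPHERIC_CO2_PERCENT = 0.041  # CO2 share of ambient air


def fold_increase(hours: float, doubling_time_h: float) -> float:
    """Growth factor of a population under idealized exponential growth."""
    return 2.0 ** (hours / doubling_time_h)


# How many times longer the autotrophic strain takes to divide.
slowdown = AUTOTROPH_DOUBLING_H / WILD_TYPE_DOUBLING_H

# How much richer in CO2 the lab atmosphere is than ambient air.
enrichment = LAB_CO2_PERCENT / ATMOSPHERIC_CO2_PERCENT

if __name__ == "__main__":
    print(f"Division takes about {slowdown:.0f} times longer")
    print(f"Lab air holds about {enrichment:.0f} times more CO2 than ambient air")
    growth = fold_increase(72, AUTOTROPH_DOUBLING_H)
    print(f"Autotroph growth over 72 hours: {growth:.0f}-fold")
```

On these numbers the engineered strain takes roughly 54 times longer to divide than normal E. coli, and a 10% CO₂ atmosphere is a little under 250 times richer in the gas than ambient air. Real cultures slow down as nutrients run out, so the fold-increase figure is an upper bound rather than a prediction.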


A long way to go


Milo and his team hope to make their bacteria
grow faster and live on lower levels of CO₂.
They are also trying to understand how the
E. coli evolved to eat CO₂: changes in just
11 genes seem to have allowed the switch, 
and researchers are now working on finding 
out how. 

The work is a “milestone” and shows the 
power of melding engineering and evolution 
to improve natural processes, says Cheryl 
Kerfeld, a bioengineer at Michigan State 
University in East Lansing. 

Researchers have already used E. coli to
make synthetic versions of useful chemicals 
such as insulin and human growth hormone. 
Milo says that his team’s work could expand 
the products the bacteria can make to include 
renewable fuels, food and other substances. 
But he doesn’t see this happening soon. 

“This is a proof-of-concept paper,” agrees 
Erb. “It will take a couple years until we see this 
organism applied.”


Feature


GENE THERAPY’S
TOUGHEST TEST


As a beleaguered field gains momentum against genetic
disorders, sickle-cell disease looms as one of its biggest
challenges. By Heidi Ledford




For years, Grajevis Bakatunkanda’s sickle-cell anaemia went undiagnosed.

Grajevis Bakatunkanda’s mother
knew the signs: when her son lost
interest in dinner, that meant the
pain was on its way. It would strike,
like clockwork, nearly every week. 
Soon the shy, skinny boy would be at 
the hospital near their home in the 
Democratic Republic of the Congo, 
where doctors would provide morphine for 
the pain and invariably diagnose him with 
malaria. 

It turns out the doctors were wrong. The 
culprits were not parasites, but Bakatunkanda’s 
own red blood cells. Normally soft and 
springy, some of the boy’s cells were becom- 
ing deformed and stiff, like splinters of wood. 
They would lodge in his capillaries, choking the 
blood flow to vital organs and sending waves of 
crushing pain into his back and chest. 

It wasn’t until the family immigrated to Cape 
Town, South Africa, in 2003, that they learned 
Bakatunkanda had sickle-cell anaemia, one 
of the world’s most prevalent genetic disor- 
ders, and one that has been studied for more 
than a century. But the diagnosis did little to 
ease the boy’s pain: the cocktail of drugs that 
he was prescribed — each of them in use for 
more than half a century and none developed
specifically for sickle-cell disease — failed to 
break the cycle. 

Now, Bakatunkanda is 22, and modern 
solutions are on the horizon in the form of 
gene therapies. After decades of work and 
some painful setbacks, techniques that involve 
altering a person’s genome have begun to 
win approval for a handful of rare disorders. 
Scientists are now working to extend the 
latest advances — including some that use 
newer gene-editing technologies — to sick- 
le-cell disease, a condition that affects some 
20 million people worldwide (see ‘How to stop
sickling’). There are more than half a dozen 
active clinical trials, and more are planned. 
“The studies are just literally coming back to 
back now,” says Lakshmanan Krishnamurti, a 
paediatric haematologist at Emory University 
in Atlanta, Georgia. “It’s a very exciting time.” 

But sickle-cell disease could challenge the 
gene-therapy field both ethically and tech- 
nologically. Gene therapies that have been 
approved for other conditions have come 
with price tags in excess of US$1 million. But 
sickle-cell disease is concentrated in regions of 
the world such as sub-Saharan Africa, India and
the Caribbean, where few have the resources 
to foot such a hefty bill. The experimental 
treatments for sickle cell are also complex, 
requiring long hospital stays and the exper- 
tise of large academic medical centres. Even 
for people who can access such resources, the 
risks might not always be worth it. 

As data drift in from early trials, scientists 
are working to improve their approaches, 
and funders have already begun to tackle 
the equity question. On 23 October, the US
National Institutes of Health (NIH) and the
Bill & Melinda Gates Foundation announced 
that they would invest at least $200 million 
over the next four years to bring gene-based 
treatments for sickle-cell disease and HIV to 
low-resource settings. 

Bakatunkanda, who founded a support 
group for people with sickle-cell disease, 
is confident that gene therapy, if shown to 
be effective, will one day reach his country, 
despite its high cost and daunting complex- 
ity. “Definitely it will,” he says. “Because South 
Africa is rising.”

He and others must keep a tight rein on
their expectations. “My patients with Internet
access, now they are coming to me: ‘Can we go
for gene therapy?’” says Dipty Jain, a
paediatrician at the Government Medical College in
Nagpur, India. “I advise them, ‘This is not yet
for you’.”


A medical revolution


The elongated, oddly shaped blood cells 
typical of sickle-cell disease were first noted in 
1910 in a young dental student from Grenada,
West Indies, named Walter Clement Noel¹.
Forty years later, the underpinnings of the 
disease began to come into view, when bio- 
chemist Linus Pauling and his colleagues 
reported that changes in the structure of 
haemoglobin, the oxygen-carrying protein 
found in red blood cells, were altering the 
shape of the cells².

The publication marked the first time the 
effects of a genetic disorder had been traced 
to their molecular roots. Pauling dubbed the 
condition “a molecular disease”. Some years 
later, researchers identified changes in the 
β-globin protein as responsible³. A mutation
in both copies of the gene encoding for this 
protein results in disease; a single mutated 
copy correlates with few symptoms and pro- 
tects the bearer from blood-dwelling parasites 
such as those causing malaria. This, in part, is 
why the disease exists in relatively high rates 
where malaria is endemic. “This is really the 
basis from where everything that we know 
today on human medical genetics has been 
developed,” says Ambroise Wonkam, a genet- 
icist at the University of Cape Town. 

Seventy years after Pauling’s discovery, 
sickle-cell disease is still underdiagnosed in 
many African countries, says haematologist 
Olu Akinyanju, the founder and first chair- 
person of the Sickle Cell Foundation Nigeria 
in Lagos. Yet early diagnosis can save lives. 
More than 300,000 people are born with the 
disease each year, and without prophylactic 
antibiotics and vaccines to help ward off other 
infections, most will die before the age of five. 
Those who survive face a lifetime of risk for 
pain crises, stroke and infection. 

Sickle-cell disease’s close association with
low-income countries has meant that it has 
historically received little attention from
pharmaceutical companies and governments
in richer regions. Many African nations have 
such pressing public-health needs that it has 
been difficult to push sickle-cell disease to the 
top of their priority lists, says Akinyanju, who 
has campaigned for decades to get African 
governments to establish treatment plans.

Over the past ten years, however, Akinyanju
and others have noticed a shift. As advocates 
and clinicians push for newborn screen- 
ing and early intervention, people with 
sickle-cell disease have begun living longer. 
The condition is not as stigmatized as it once 
was. Akinyanju proudly ticks off friends with 
sickle-cell disease who have lived into their 
60s and beyond, becoming doctors, judges 
and world travellers. 

The World Health Organization and the 
American Society of Hematology have also
worked to bring the disease to the attention of
researchers and pharmaceutical companies. 
And Bakatunkanda and other immigrants have 
raised awareness in wealthier nations, says 
Wonkam. There are signs that this attention 
is paying off. On 25 November, the US Food 
and Drug Administration approved a drug 
for sickle-cell disease that aims to reduce clumping
between haemoglobin molecules. 

But although gene therapy might seem a 
rational approach for one of the world’s best- 
known genetic diseases, the field has faced 
its setbacks. Early attempts were marred by 
the high-profile death of Jesse Gelsinger in 
1999, who was participating in one of the first 
gene-therapy clinical trials. A procedure used 
during the trial to replace immune-system 
genes in blood stem cells caused leukaemia 
in several of the participants. 

Against that backdrop, some felt that it was 
premature to apply gene therapy to sickle-cell 
disease, says haematologist David Williams at 
Boston Children’s Hospital in Massachusetts. 
“Sickle cell is not an immediately lethal dis- 
ease,” he says. “In some ways, it wouldn't be 
ethical to treat those patients with a highly 
risky experimental approach.” 

Furthermore, the tools were not yet up to the
task, says Donald Kohn, a specialist in paediatric
bone-marrow transplants at the University
of California, Los Angeles. If researchers were to
shuttle in a normal haemoglobin gene, it would
need to be able to crank out large amounts of
protein to sufficiently mute the effects of the
sickled version. Early gene-therapy technologies
were not able to express genes in human
cells at such high levels, says Kohn.


HOW TO STOP SICKLING


When blood-forming stem cells have mutations in the gene for β-globin, they produce red blood cells that can
become hardened and sickle-shaped. Gene therapies for sickle-cell disease aim to remove these stem cells from
the bone marrow, alter their genomes and then replace them in the patient.


THE PROBLEM


The gene that produces β-globin, a component of haemoglobin, has mutations that cause the protein to clump.

Cells have healthy copies of the genes needed to produce fetal haemoglobin, but these are switched off in adults.


THE SOLUTIONS


β-GLOBIN RESCUE

Scientists engineer a virus to deliver a functional copy of the gene that produces β-globin.
The gene has been modified to prevent sickling.

γ-GLOBIN ADDITION

A virus delivers genes that produce γ-globin, a component of fetal haemoglobin.
The genes are modified to remain switched on in adult cells.

BREAKING THE OFF SWITCH

Various gene-silencing and editing technologies turn off production of BCL11A, a
protein that normally prevents expression of γ-globin genes.


But despite the setbacks, some gene-therapy 
researchers pushed on, developing safer and 
more potent ways to shuttle genes into cells. 
They broke through in 2016, when the Euro- 
pean Commission approved a gene therapy 
for treating ADA-SCID, a rare immune disor- 
der that often kills children before their first 
birthday. Then in 2017, the US Food and Drug 
Administration approved a gene therapy to 
treat a rare form of blindness.

By this time, some researchers had turned 
their attention back to sickle cell, armed with 
improved tools and with the backing of the 
biotechnology industry. Current trials are tak- 
ing a variety of approaches. Kohn is trying to 
insert a copy of the β-globin gene that has been
modified to resist sickling. So is Bluebird Bio,
a company in Cambridge, Massachusetts. The
firm looks set to be the first to win approval to
market such a treatment in the United States,
according to Yaron Werber, a biotechnology
analyst at Cowen, a financial-services company
in New York City. 

Others are introducing modified copies of 
the genes that encode fetal haemoglobin, a 
form of the protein that is produced in the
developing fetus but usually shuts off soon
after birth. Fetal haemoglobin is an attrac-
tive option because it works about as well as
the adult version, and it prevents defective
haemoglobins from clumping together.

A third approach seeks to block a mechanism
that switches off production of fetal haemo- 
globin after birth. The usual off-switch is a pro- 
tein called BCL11A, and suppressing it in mice 
with sickle-cell disease can keep fetal-haemo- 
globin levels high well into adulthood and 
prevent symptoms of the disease. In Boston,
Williams has licensed technology to Bluebird 
Bio that uses a technique called RNA inter- 
ference to dial down expression of the gene 
encoding BCL11A in blood stem cells. Sangamo
Therapeutics in Richmond, California, in 
partnership with Sanofi in Paris, is using 
gene-editing tools called zinc-finger nucleases 
to create mutations that disable the gene. And 
Vertex Pharmaceuticals in Boston has teamed 
up with CRISPR Therapeutics in Cambridge, 
Massachusetts, to do much the same using the 
CRISPR-Cas9 gene-editing technique. In all 
three approaches, blood-producing stem cells 
are removed from the body, genetically altered 
(often with the help of a virus) and then rein-
troduced into the bone marrow. Before the
cells are replaced, participants are typically
treated with a chemotherapy called busulfan
to destroy the remaining diseased stem cells
and help the reintroduced, genetically altered 
cells to take over. 

That kind of regimen is risky: participants 
can develop acute and severe anaemia. The 
treatment wipes out their white blood cells, 
and wreaks havoc on the lining of the gut, 
potentially leaving them dependent on intra- 
venous nutrition. Many will need to stay in the 
hospital for more than a month. The chemo- 
therapy also causes infertility, and can cause 
cancer later in life. 

This means that gene therapy would 
probably be used only in those with the most 
serious forms of sickle-cell disease. Yet many 
of those people will also have heart, kidney or 
liver damage that would make the chemother- 
apy too dangerous. 

Sickle-cell disease complicates the therapy 
in other ways, too. In many cases, when doctors 
harvest bone marrow, patients first receive a 
drug that makes it easier to collect blood stem 
cells. But that is too dangerous to use in people 
with sickle-cell disease because it raises the risk 
of pain crises. And because diseased red blood 
cells die faster than healthy ones, the stem 
cells in a person with sickle-cell disease must 
work harder to produce new blood cells. This 
can leave them in poor condition for harvest 
and growth in laboratory cultures. As a result,
participants often need blood transfusions just
before harvest to ease the stress on their stem
cells.

Despite these challenges, early signs of
success have been making headlines. One of 
the men in Williams’s RNA-interference trial 
has been symptom-free for one year. And the 
first patient in the CRISPR trial has now left 
the hospital after completing the gruelling 
therapy. On 19 November, Vertex and CRISPR 
Therapeutics announced that the person has 
not experienced any pain crises and has main- 
tained a high level of fetal haemoglobin for four 
months. Both trials have generated excitement 
on social media — too much, in some cases. 
“I have difficulty right now with folks being 
excited about the discharge of a patient from 
the hospital, as if that were tantamount to a
cure,” says Alexis Thompson, a haematologist 
at Northwestern University in Chicago, Illinois. 
“That’s a pretty low bar.” 

Still, there is cause for cautious optimism. 
So far, none of these trials has been stopped 
for safety concerns. And Bluebird Bio has 
treated 13 people, some of whom have been 
monitored for a year after treatment with no 
severe pain crises, the company reported in 
June. The gene therapy used was approved 
in the European Union in June to treat some 
people with a related genetic blood disorder 
called β-thalassaemia.

But a major concern for many people is 
cost. The treatment for β-thalassaemia runs
to roughly $1.8 million — not including the 
hospital stay and other associated costs. 

This is still potentially cheaper than standard 
treatments over the course of a lifetime, says 
Mani Foroohar, an analyst at the investment 
bank SVB Leerink in Boston, Massachusetts. 
Also, Bluebird Bio has established an unusual 
fee structure: payments are made over the 
course of five years, and can be halted if the 
treatment stops working. Still, Foroohar says, 
it’s not clear whether the same model will be 
possible in other regions. 

The price tag is certainly well beyond the 
means of many of Jain’s patients in central 
India, who come to her hospital because they 
can’t otherwise afford the roughly $3 per 
month that it costs for standard treatments. 
Even in the United States, access to the gene 
therapies is likely to be a challenge. This is par-
ticularly true for Black Americans, who tend to
have more limited access to health care than 
White Americans. Although the trials are still 
in their early days, Krishnamurti urges inter- 
ested people to begin advocating immediately 
for access to the therapies. “It’s an enormous 
ethical issue,” says Krishnamurti, who coun- 
sels people with sickle-cell disease each week 
from his hospital in Atlanta. “In my community 
conversations, I say, ‘You had better be at the 
table, otherwise these decisions will be made 
without your input.’”

At the Cincinnati Children’s Hospital in 
Ohio, haematologist Punam Malik is hoping
to take the first steps towards making gene
therapy cheaper and simpler. Malik trained as 
a doctor in India, where she saw many people 
with sickle-cell disease and related conditions. 
When she immigrated to the United States 
about 30 years ago, she vowed to make sure 
that her research would benefit people in 
resource-poor countries.


Now, Malik is leading a trial that introduces
stem cells that produce fetal haemoglobin. It 
uses low doses of a drug called melphalan to 
remove diseased cells from the bone marrow, 
which should make the treatment less toxic 
than the usual busulfan. Her hope is that the 
technique will reduce the need for a long hos-
pital stay, making the treatment cheaper, safer
and more practical. 

But the approach has been criticized by 
others, who worry that the low-dose approach 
might leave behind some uncorrected cells, 
and make the therapy less effective. “You want 
to do as well as you can,” says Stuart Orkin, who
studies blood disorders at Boston Children’s 
Hospital. 

Malik counters that once a high dose has 
been established as effective, it is hard to scale
it back. She points to the example of cancer 
chemotherapy: in some cancers, researchers 
are reducing the dose of some drugs and find- 
ing that they work just as well as, if not better 
than, the higher doses tried initially. But it 
has taken oncologists decades to take that 
step, she notes. “I might fall flat on my face, 
and I might have to dial up. But it will be very 
difficult for the others to dial down,” she says. 

Her trial has also run up against the practical 
realities of exporting gene therapies to regions 
with fewer resources. Her team received 
FDA approval to carry out the trial only at 
Cincinnati Children’s Hospital. But after Malik 
gave a talk at a conference in Jamaica, someone
with sickle-cell disease approached her asking 
for help and describing multiple hospital visits 
for pain crises. 

So, Malik developed a collaboration in 
Jamaica. “I felt we had to,” she says. It took the 
team about two years to get the necessary 
approvals and funding. And then the clini- 
cal team in Jamaica ran up against another 
problem: lack of reliable blood for transfusion. 

The team reported in April that its first
patient has experienced only two pain crises in
the 18 months since treatment ended and has 
maintained high levels of haemoglobin. The 
team has since treated a second person, and 
two more are lined up to take part, Malik says.

The trouble is not only the expense and 
practicalities, but also the availability of cli- 
nicians and facilities who can handle stem-cell 
transplants. Rural regions already struggle to 
supply people with hydroxyurea, a rela-
tively cheap medicine that reduces the rate of
pain crises. It’s hard to imagine these regions 
having enough personnel to monitor recipi- 
ents of gene therapy over the long term, says 
anthropologist Duana Fullwiley at Stanford 
University in California. 

Some argue that it is too early to think about
such issues. “If we refine the technology, it will 
be affordable in the long run,” says Wonkam. 
“The price right now for me is not the problem.
The focus needs to be on the efficiency.” 

But others think that the time to start 
thinking about global access is now. To do 
otherwise “would be almost unethical”, says 
NIH director Francis Collins. 

Collins thinks that the key to fulfilling the 
NIH’s project with the Gates Foundation will be
in finding ways to deliver the corrected genes 
or gene-editing tools to bone-marrow stem 
cells that don’t involve having to remove the 
cells first, making therapies cheaper and easier 
to deliver. It is an ambitious goal, and one that
is occasionally met with scepticism, Collins 
says. “Sometimes there was a vague sense of, 
‘Boy, you're just outside the boundaries of 
reality there, Collins,’” he says.

There are already suggestions that the 
viruses typically used to shuttle genes into 
cells in a dish can be modified to insert genes
into blood-producing stem cells while they’re 
still in the body, notes Kohn. “It’s a great lofty
goal,” he says. “I think the science is advancing 
pretty rapidly.” 

For Bakatunkanda, his salvation turned 
out to be ageing, not medicine. Some people 
with sickle-cell disease fare worse as children 
than as adults, he says, and he thinks he is one
of them. He still has crises, but not nearly as 
often. In recent months, he has taken on activ- 
ities such as hiking and bodybuilding that he 
once thought were off-limits. “I just know how
far I can push myself,” he says.

But he would prefer a life without the 
constant threat of pain crises and strokes. He 
is aware of the promise of gene therapies, but 
knows that it is not yet clear whether they will 
provide a cure. “I would prefer that,” he says. 
“But at the moment it’s not a guarantee.” 


Heidi Ledford writes for Nature from London.


Triassic rocks in the Italian Dolomites bear evidence of a surprisingly wet episode in Earth’s history.

An extended bout of warm, wet weather 232 million years ago might have triggered the rise
of the dinosaurs and completely altered the history of life on Earth. By Michael Marshall


Alastair Ruffell could see there
was something odd about the 
rocks near his childhood home in 
Somerset, UK. The deposits hail 
from the Triassic period, more 
than 200 million years ago, and 
most are a dull orange-red, signifying
that they formed when the
region was a parched landscape, baked by the 
sun. Nothing strange there. But outcrops on 
Somerset’s Lipe Hill have a thin stripe of grey
running through the heart of the red stone.
That band signals a time when arid desert dis-
appeared and the region transformed into a
swampy wetland. For some reason, an incredi- 
bly dry climate had turned wet, and stayed that 
way for more than a million years.

The change intrigued Ruffell when he first 
found the outcrops in the mid-1980s, but the 
young geologist had a PhD project to finish. So
he put the Triassic puzzle to one side, until a 
chance encounter in 1987 with another young
scientist, palaeontologist Michael Simms.
During his postdoctoral studies, Simms had 
discovered evidence of extinctions in the 
Late Triassic, during Ruffell’s mysterious wet 
period. In the late 1980s, the pair pushed the 
idea that the two findings were connected, but 
for years, their results were dismissed.

Three decades later, there is a growing
consensus that they were right, after all.
Something strange happened in the Late
Triassic, and not just in Somerset. About
232 million years ago, during a span known as
the Carnian age, it rained almost everywhere. 
After millions of years of dry climates, Earth 
entered a wet period lasting one million to two 
million years. Nearly any place where geolo- 
gists find rocks of that age, there are signs of 
wet weather. This so-called Carnian pluvial 
episode coincides with some massive evolu- 
tionary shifts. 

Perhaps most dramatically, the Carnian 
pluvial might have overlapped with when 
a rare group of reptiles — early dinosaurs — 
evolved into a diverse group and came to dom- 
inate land ecosystems. The Carnian could have 
paved the way for the spectacular dinosaurs 
that evolved later, including Stegosaurus and 
Tyrannosaurus. 

Other groups also left the Carnian in very 
different shape from how they had entered 
it: reef-building corals and marine plankton 
were all becoming more ‘modern’ — moving 
evolutionarily closer to the forms alive today. 
The period could even have seen the appear- 
ance of the first mammals. “It was almost like 
a turning point between some elements of a 
more ancient world and a modern world,” says
Simms. 

After years in obscurity, the Carnian pluvial 
is becoming a major research focus. In May 
2017, scientists gathered for the first con- 
ference dedicated to the period, held at the 
Institute for Advanced Study in Delmenhorst, 
Germany. Since then, the Journal of the Geolog- 
ical Society has dedicated two special issues 
to the topic. Over the past decade, many 
researchers have begun studying Carnian 
rocks intensively. They want to understand 
why the climate changed, and why that led to 
such dramatic evolutionary shifts. Evidence 
now points to massive volcanic eruptions. 

This is a remarkable turnaround for an event
discovered almost by accident in the 1980s. 


A chance encounter


It began when Simms, now at National 
Museums Northern Ireland in Holywood, 
went to the University of Liverpool, UK, for a
postdoctoral research fellowship. He studied 
crinoids: marine animals, related to starfish, 
that resemble flowers or feathers. 

Simms focused on crinoids living in the 
Triassic period, which ran from 252 million 
years ago to 201 million years ago. The Triassic 
was bookended by two of the most troubled 
times in Earth’s history: it started right after 
a mass extinction at the close of the Permian
period, and ended with another mass extinc-
tion at the Triassic–Jurassic junction.

But Simms was in for a surprise. “By the tail 
end of 1987, it had become clear that there was a
quite significant extinction among the Triassic 
crinoids,” he says. But the die-off came tens of 
millions of years before the end of the period. 
This placed the extinction in the Carnian: the 
fifth of seven shorter ages in the Triassic. 


Intrigued, Simms returned to the University
of Birmingham, UK, where he had done his doc- 
torate, for a visit. His old office was occupied 
by palaeontologist Paul Wignall and Ruffell, 
who is now at Queen’s University Belfast, UK. 

Ruffell’s studies focused on sediments from 
the later Cretaceous period, but for fun he was 
investigating Triassic rocks called the Mercia 
Mudstone Group, which mostly reflect dry cli- 
mates. It was in the Carnian section of these 
rocks that he found a thin layer of grey sand- 
stone, rich in fossils such as sharks’ teeth. It 
was the remains of a river or delta. “Slap-bang
in the middle of all this horrible arid stuff was
this probably rather pleasant environment,”
says Ruffell. 

When the three were chatting, Simms 
mentioned the Carnian crinoid extinctions. 
According to Simms and Wignall, Ruffell 
replied: “It was raining then. Perhaps the 
crinoids didn’t like the rain.” It was a flippant 
remark, one Ruffell does not remember mak-
ing, but it struck a chord with Simms. Changing
climates can cause extinctions, so perhaps this 
was the case for the shifts in the Carnian. 

Simms and Ruffell began investigating, and 
found that there was also evidence of a wet 
spell in the Carnian in rocks from Germany, the 
United States, the Himalayas and other places. 
What’s more, it was not just the crinoids that 
faced extinctions: amphibians and land plants 
lost members, too. In 1989, the pair published
evidence for an event they named the Carnian
pluvial episode, using the geological term
referring to rain. 

The results didn’t make much of a splash, 
except from some researchers who attacked 
the idea. “I remember one or two quite senior 
academics thought it was a preposterous idea,” 
says Simms. 

In 1994, a team led by Henk Visscher of 
Utrecht University in the Netherlands pub- 
lished a strongly worded rebuttal claiming 
that although some spots might have grown
rainy during this time, many environments
remained dry. Visscher argued that, instead
of an increase in rainfall, the evidence could
be explained away by “high groundwater 
tables”. 

Rebuffed, Simms and Ruffell changed 
course. “We just moved on to all sorts of other
things,” says Simms. Whereas Simms pursued
a career in geology and palaeontology, Ruffell
became an expert in forensic geology. 


Wet world 


However, the Carnian pluvial episode did 
not go away. Geologists in Europe, especially 
Italy, continued amassing evidence for wet 
conditions around 232 million years ago. The 
coincidences piled up. 

In the Late Triassic, the world looked nothing
like today’s (see ‘Time for a change’). The land-
masses were all connected, forming a ‘super-
continent’ called Pangaea, where the climate
was hot and dry, especially in the interior. 
Land ecosystems were dominated by reptiles, 
including the first dinosaurs. There were no 
flowers, grasses or birds. 

There were also no mammals, but the Carnian
might have been when that changed. In 2005,
P. M. Datta at the Geological Survey of India in
Kolkata described a single mammal tooth from
Carnian-aged rocks in India. Another tooth,
discovered in Carnian rocks in Germany, might
also have belonged to a mammal.

The origin of mammals is a topic that 
triggers strong debate. Wignall, who is now 
at the University of Leeds, UK, says they could 
have appeared during the Carnian, but it’s also
possible that there are earlier ones we have 
not found yet. And many palaeontologists 
argue that true mammals did not emerge 
until the Jurassic, millions of years later. If so, 
the Carnian fossils are not from mammals, 
although they could be from their ancestors. 

Whatever the case is with mammals, a 
string of discoveries in the past decade or so 
offers strong evidence for other evolutionary 
shifts. Researchers reported in 2013 that the 
Carnian saw the origin of marine organisms 
called calcareous nannoplankton. These sin-
gle-celled organisms surround themselves
with hard shells of calcium carbonate. Today, 
they form huge blooms and are known as the 
‘grass of the sea’. They have a major role in 
cycling carbon between the air and the ocean. 

Another group that underwent major 
changes in the Carnian was the scleractinian 
corals, which build today’s giant coral reefs. 
Scleractinians emerged earlier in the Trias- 
sic, but it was not until the Carnian or shortly 
afterwards that they began constructing big 
reefs. Isotopic evidence and other clues in 
fossil corals suggest this could be when they 
acquired their modern symbionts: photosyn-
thetic algae that supply them with nutrients.


A fiery time


Not everyone is convinced that the world went
through a warm, wet phase in the Carnian.
“Still, I have my doubts,” says Visscher. He
accepts that the climate changed, but says 
rainfall could have become more seasonal, 
leading to annual blooms of vegetation. Similarly,
Matthias Franz at the Georg-August
University of Göttingen, Germany, has found
evidence that the extra damp might have
been caused by rising seas, at least in parts
of Europe, although it is not clear that this can
account for the changes elsewhere. Still, Franz
emphasizes that the period is significant any-
way. “There is obviously something happening
at this time,” he says. “The question is what.”


TIME FOR A CHANGE

After millions of years of generally dry climate,
the planet went through several humid phases
that together spanned more than a million years
during the Carnian age of the Triassic period.
The climate change coincided with massive
eruptions (resulting in what are known as the
Wrangellia flood basalts) in what is now
northwestern North America. The upheaval
might have spurred massive evolutionary
changes, including the emergence of new
dinosaur groups.

Simms and Ruffell had previously suggested 
that volcanic eruptions were responsible for 
the climate change, and geologists knew there
was a prime candidate: the cataclysm that 
created massive basalt formations — several 
kilometres thick in places — running from Brit- 
ish Columbia in Canada to Alaska. 

Dubbed the Wrangellia Terrane, after 
Alaska’s Wrangell Mountains, these lavas are 
part of a Large Igneous Province formed by vol-
canoes spewing out huge volumes of lava over
hundreds of thousands or millions of years, 
around 232 million years ago. The volcanoes 
were submarine, but emerged above the water 
as lava continued to pour out, says Andrea 
Marzoli at the University of Padua in Italy. 

If these vast eruptions happened at the 
same time as the Carnian pluvial episode, they 
could have released enough carbon dioxide to 
warm the globe. And that could have increased 
rain, by enhancing evaporation from seas and 
rivers. Some scientists have come to regard the 
name Carnian pluvial as misleading, because 
the main change at the time would have been 
an episode of global warming. 

“The natural thing to do was to understand 
if this increase in rainfall, that was seen every-
where, was triggered by injection of CO₂ in the
atmosphere,” says Jacopo Dal Corso, a geolo-
gist at the University of Leeds. His team ana-
lysed samples of carbon-rich Carnian material
from the Italian Alps. In 2012, the researchers
reported unusually low levels of carbon-13, a
heavy isotope of carbon, during the Carnian
pluvial. This indicated that a huge volume of
the lighter isotope, carbon-12, was injected 
into the atmosphere — and eruptions in 
Wrangellia could have been the prime source. 

Subsequent studies have backed Dal Corso’s
claim that the carbon cycle was perturbed dur-
ing the Carnian for about one million years,
owing to the eruptions. But for some, the link 
remains tentative because uncertainty in the 
dating of rocks makes it hard to definitively 
say that the Wrangellia eruptions happened 
at the same time as the climatic and evolution- 
ary changes in the Carnian. Wignall says this is 
because the Carnian has not yet been studied 
intensively; uncertainties can span one million 
years. Marzoli plans to sample Wrangellia next 
summer, partly to clarify its age. According to 
him, Wrangellia is the most likely explanation
because there are no other candidates. 

Meanwhile, the list of evolutionary changes 
that happened in the Carnian pluvial continues 
to grow.
The most dramatic claim is that the Carnian
was crucial for the dinosaurs’ rapid evolution- 
ary expansion. Evidence indicates that dino- 
saurs emerged before the Carnian, about 
245 million years ago, but those earliest crea- 
tures are very rare and only a few species are 
known. 

What’s clear is that dinosaurs changed 
drastically. At the start of the Carnian, they 
were all small and bipedal. But by the end, the 
two major groups had emerged. These were 
the ornithischians, which later included Steg- 
osaurus and Triceratops; and the saurischians, 
which gave rise to huge, long-necked species 
such as Brachiosaurus, and theropods such as
Tyrannosaurus rex and birds. Mike Benton, a 
palaeontologist at the University of Bristol, UK, 
and his colleagues documented some of these 
changes by using well-dated samples from the 
Alps to create a high-resolution timescale of 
animal tracks in the Late Triassic. The early
Carnian was dominated by reptiles called cru- 
rotarsans. But by the end of the Carnian, the 
dinosaurs dominated. This shift took just 4 
million years, and coincided with the pluvial 
episode. And after that rapid rise, dinosaurs 
ruled the world for more than 150 million years.

With all these changes happening, and 
the fuzzy dating of rocks from the Carnian, 
researchers are struggling to create a coherent 
picture of how the climate changed and how 
that affected ecosystems. But the Carnian has 
become a hot topic. “One of the fascinating 
things about this interval is how many modern
groups appear, from vertebrates all the way 
down to plankton,” says Wignall. 

This was one of life’s major transitions. 
The planet was still recovering from the 
end-Permian extinctions and the Carnian saw 
the rise of groups that have ruled the world 
ever since. 

The two researchers who started this whole 
affair are surprised and delighted by what has 
happened. Simms is content to watch from the
sidelines, but Ruffell has resumed studying 
Carnian geology. The irony, Ruffell says, is that 
his Carnian studies were only a hobby. This 
dramatic period that shook up evolutionary 
history was found, he says, by “a couple of guys
who really shouldn’t have been working on it
in the first place”.


Michael Marshall is a science journalist in 
Devon, UK.


Why pipelines persist amid
geopolitical turmoil


In a new book, Thane Gustafson analyses the
Russia–Europe gas trade. By Andrew Moravcsik


Many people imagine that geopolitics
drives the energy trade between
Russia and Europe. As the story
goes, each side seeks to exploit gas
and oil to influence the other in the
big game of power politics — and Russia seems 
to have the upper hand. The European Union 
now imports nearly 40% of its natural gas 
from Russia. For decades, national-security 
specialists have recommended that Europeans 
reduce their dependence on these imports at 
any cost. Most recently, a fierce debate over 
Nord Stream 2 — a second Russian pipeline 
across the Baltic Sea to Germany — has led US 
congresspeople to threaten sanctions. 

Political scientist Thane Gustafson chal- 
lenges this view in The Bridge. He argues that 
the trade in gas reflects slow-moving patterns 
of market demand and supply, which in turn 
stem from incremental changes in technology 
for pumping, piping and consuming fuel. The 
result is a pattern of remarkably stable eco- 
nomic interdependence that seems impervi- 
ous to the geopolitical environment. 

As extraction and pipeline technology 
opened up Soviet gas fields in the 1960s, and 
the ongoing postwar reconstruction of Europe 
stoked demand, East-West gas trade became 
all but inevitable. Ever since, Russia has wanted 
to supply gas and Europe has wanted to buy 
it. The past 50 years have seen energy shocks 
and gluts; major political crises from Poland 
to the former Yugoslavia; the fall of the Soviet 
Union and rise of Russian President Vladimir 
Putin’s authoritarian state; outright warfare in
Ukraine and elsewhere; massive experiments 
in deregulation; and the rise of environmental- 
ism. Yet relations between Europe and Russia 
in the natural-gas sector have remained nearly
constant. This is because change is slow in 
three factors: proven reserves of gas, aggre- 
gate demand for energy and investment in 
physical infrastructure to link the two. 

The Bridge is an overview rather than a 
work of original research. Yet it offers a read-
able, intelligent, even-handed historical
interpretation of this modern economic rela-
tionship. It divides East-West natural-gas
relations neatly into three distinct periods. 

The first begins around 1960, with the 
spread of transport and use of natural gas in 
Europe, originally limited to small local net- 
works in Italy and the Netherlands. Backed by 
US expertise, Europeans began to consider 
long-distance gas pipelines from Siberia, and 
made Western industrial equipment, invest- 
ment and technical know-how available to the 
Soviet Union. The rigidity of the Communist 
system meant that production took almost 
a decade to come online. Eventually, the gas 
arrived, at first flowing through a terminal in 
Austria. 

The second period begins after 1970, when 
the quantity of Russian natural gas entering 
Europe increased. European consumption 
expanded quickly; gas proved cheaper and 
environmentally cleaner than coal or oil. Other 
countries, notably undersea-gas producers 
Norway and Britain, also created highly cen- 
tralized systems for exploiting and piping the 
fuel. Yet the vast, low-cost Russian reserves 
enjoyed a comparative advantage, rising to 
provide almost half of consumption in Euro- 
pean countries, prominent among them 
Germany and Italy. 

The Bridge: Natural Gas in a Redivided Europe
Thane Gustafson
Harvard Univ. Press (2020)

This period, Gustafson argues,
demonstrates the exceptionally stable
nature of this type of international economic
cooperation. Pipelines take decades to build,
then tend to operate for decades more, often
governed by just one or two long-term con-
tracts. The physical, tangible linkage between
producer and consumer “automatically
creates a mutual dependence”, he writes.
Moreover, because pipelines are centralized, 
they encourage domination of the market by 
monopolies — in the 1970s, these consisted of
the Soviet Ministry of the Gas Industry, and 
European national or regional utilities. Natural 
gas, or anything else that travels through a 
fixed infrastructure, becomes a “relationship 
commodity”: investments, personal contacts 
and market shares follow the technology. 
This, Gustafson avers, is why the East-West 
gas trade has remained impervious to geo- 
political disruption. In 1968, shortly after the 
Soviet invasion of Czechoslovakia, Austria 
accepted the first Russian gas shipments 
into Europe. In 1981, when the pro-demo- 
cracy Solidarity movement in Poland led 
to the Soviet-backed imposition of martial 
law, the US administration under president 
Ronald Reagan imposed sanctions on exports 
of pipeline technology. It could afford to do 
this because it was largely uninvolved in the 
East-West energy trade.

Russian natural-gas pipelines in northwestern Siberia.

Yet behind the scenes of these political 
upheavals, the real stakeholders acted dif- 
ferently. The Soviet Union developed home- 
grown alternative compressor and pipe 
technology — crucial for transporting gas — 
and Europe continued to sell technology that 
the Soviets could not produce at home. 

The third period began around 1990. Geo- 
politics grew more unruly. The Soviet Union 
collapsed in 1991. The gas ministry was turned 
into the massive state-owned corporation 
Gazprom, which was then largely privat- 
ized. Putin, who became president in 2000, 
brought Gazprom back under near-total state 
control. Russia also provoked a series of inter- 
ventions and conflicts in Georgia, Moldova, 
Syria and Ukraine. The West responded by 
imposing sanctions — limits on investment 
and exports in sensitive military and civilian 
technologies, and even on energy investment. 
Russia’s countersanctions largely targeted 
Western agricultural exports. More recently,
Russia has become involved in the disruption
of elections in the West, and in cyberwarfare. 
Yet gas quietly continues to flow through the 
East-West pipeline. 

Much of the book’s analysis of the most 
recent period focuses on another potentially 
disruptive change: new EU regulations. 
Gustafson makes much of the fact that, 
30 years ago, the European Commission began 
pressing to open up the European energy mar- 
ket to greater competition. Directives render 
prices more transparent and uniform, and 
compel firms to supply gas across borders. At 
the same time, the commission is acting more
forcefully to limit monopolies and cartels, and 
domestic deregulation has led to the rise of 
new corporate players. 

Overall, this concerted EU policy has further 
strengthened Europe’s hand. Russia cannot use 
embargoes or market segmentation to exploit 
individual countries. And Gazprom — which 
still has a near-monopoly on Russian exports, 
even though it is losing domestic market share
— cannot acquire dominant positions in Europe.
This is a significant development — and, from a
Western perspective, a positive one.

Yet it is difficult to discern how EU policies 
have altered Russia’s gas trade with Europe in 
any fundamental way. Exporting and import- 
ing nations alike have found ways to maintain 
overall control of their markets. If anything, 
Gustafson’s analysis would seem to show 
that the primary impact of EU consolidation 
has been to insulate a mutually beneficial 
economic status quo from disruption. 

Gustafson ends by considering long-term 
threats, which he introduces only to dismiss. 
For 20 years, conflict with Ukraine — first
over energy pricing, then over politics — has
led Russia to propose new pipelines that geo- 
graphically circumvent its neighbour. Many 
worry that new lines, such as Nord Stream 2, 
might cut Ukraine out entirely. Yet Gustafson 
remains confident that if this occurs, Kiev, 
already transitioning away from Russian 
natural gas, will find new suppliers. 

Another threat comes from new techno- 
logical options for transporting fuel as liq- 
uid natural gas, a more fungible form that 
would permit US imports to Europe. This 
might create an alternative to stable pipeline 
politics, although the transition would be 
slow because of the higher cost of the tech- 
nology. Also, environmental-protection and 
climate-change concerns will continue to rise, 
reducing European demand in the long term.
Yet, in the interim, natural gas will remain 
abundantly available, relatively inexpensive 
and still environmentally superior to oil, coal 
or nuclear power. 

Gustafson’s overall conclusion is thus that 
Russian gas is likely to remain Europe’s major 
energy bridge to a future world of renewables. 
He even sees the next few decades as a “golden 
age of gas”. This is a soberly optimistic conclu-
sion, not least because it suggests that com-
mercial interests will induce modern countries
to transcend ideological and geopolitical dif- 
ferences. 


Andrew Moravcsik is professor of politics and
international affairs, and director of the EU 
Program, at Princeton University in New Jersey. 
e-mail: amoravcs@princeton.edu 
Setting the agenda in research


Comment 


Women from some 
minorities get too few talks 


Heather L. Ford, Cameron Brick, Margarita Azmitia,
Karine Blaufuss & Petra Dekens


Researchers from racial
and ethnic groups that are
under-represented in US 
geoscience are the least likely 
to be offered opportunities 
to speak at the field’s biggest 
meeting. 


Biases — structural, implicit and
explicit — exclude many people from
science, technology, engineering
and mathematics (STEM) education
and employment, and devalue their
contributions. Most studies focus on bias
against women. Few data sets offer enough
generalizability or statistical power to eval-
uate the representation of minority ethnic
and racial groups, or to examine intersec-
tionality. The latter describes the interwoven
forms of discrimination that affect a person
from multiple marginalized groups (such as
racism, sexism, classism or ageism), locate
them in systems of oppression and limit their
upward mobility — as might be experienced by
a young African American woman in science
in the United States.
We offer just such a data set here. 
Presenting at scientific conferences is key 
to academic career progression. Scientists 
don’t just communicate results; they also 
develop relationships with collaborators and 
mentors, and identify job and funding oppor- 
tunities. Giving a talk confers recognition 
and prestige, particularly for students and 
early-career researchers. Despite historical 
inequities, women are now presenting more 
at conferences and colloquia. These gains
are especially visible at conferences that 
are organized by women or that specifically 
support early-career participants. 
We found that US scientists from minor- 
ity racial and ethnic populations already 
under-represented in science had relatively
fewer speaking opportunities at a key scien-
tific conference over a four-year period than
their proportion in the sample would predict;
the imbalance was most severe for women. 
This disadvantage for under-represented 
minority groups held across career stage (see 
‘Who gets the microphone?’). 

Our results underscore the pressing need to 
support minority groups at conferences — as 
elsewhere in STEM — to advance equity and 
improve research. 


Dataset and methods 


The American Geophysical Union (AGU) is 
an international non-profit scientific asso- 
ciation with around 60,000 members in 
137 countries. Since 2013, the AGU has col- 
lected self-reported demographic data from 
its membership, including gender, race or 
ethnicity (for US-based academics only), 
career stage and birth year. 

The AGU Fall Meeting is the world’s larg- 
est Earth- and space-science conference. 
The attendance each year from 2014 to 2017 
was approximately 24,000-28,000 people. 
Around 22,000 abstracts are submitted for 
selection as talks or posters each year; few are 
rejected (<0.05%). Membership is necessary 
for submitting, although not for attending 
the meeting. 

Abstracts are submitted to topical sessions. 
Sessions are proposed and organized, and 
abstracts vetted, by a group of conveners — 
academics, industry members, government 
scientists and others. The primary convener 
must be an AGU member. There are three 
tracks by which geoscientists get to present 
at the meeting — two by submission, one by
invitation. Authors can submit abstracts to
conveners, who decide which will become 
talks and which posters; or authors can submit 
abstracts just to give a poster. In addition, 
session conveners directly invite scientists 
to speak (strictly, to send in abstracts, which 
generally results in a talk).

The database of 87,544 accepted abstracts 
from the meetings between 2014 and 
2017 offers a unique opportunity to probe 
inequities of opportunity between demo-
graphic groups. Presentations are approxi-
mately 34% talks (about one-third of which are
directly invited) and 66% posters. 


Career stage 

Of US-based authors, 98% (n = 53,247)
provided career information. Researchers had
verified themselves as students (undergradu-
ates and graduates) or the AGU had calculated
career stage from years since highest degree
obtained: early career (0-10 years); mid-career
(10-25 years); and experienced (late career;
more than 25 years). Controlling for career
stage is crucial because minority racial and
ethnic groups are concentrated in the student
and early-career stages (see ‘Fewer seniors’).
This is due to both a leaky pipeline in educa-
tion and professional advancement and the
fact that senior groups more strongly bear the
imprint of historical biases.

Some scientists opt to present posters; others are assigned them instead of being asked to talk.


Race, ethnicity, gender 

The AGU recorded self-reported ethnicity and 
race from US-based authors only (n=54,446). 
Of these, 71% (n = 38,768) reported a category
(defined as per the US census, see Supplemen- 
tary information): White (58%), Asian Amer- 
ican (7.3%), Hispanic/Latino (3.9%), African 
American (1.1%), Native American (0.3%) or 
Pacific Islander (0.2%). The remainder marked 
‘other’ (13%) or ‘prefer not to answer’ (13%), 
or didn’t respond (2.8%). We did not verify 
whether Native American respondents were
citizens of tribal nations; we acknowledge that
self-reported identity is not the same as tribal 
citizenship. ‘Other’ could refer to individuals 
who are multiracial or who do not identify 
with the categories listed. Before analysis, we 
decided to exclude authors who were based 
outside the United States (n = 33,098), who 
identified as ‘other’ or who did not report 
ethnicity or race. 

Of our sample of US-based authors who 
reported their race and ethnicity, more than 
99% (n = 38,716) identified as female or male 
(the third option was ‘prefer not to answer’). 
We appreciate that this binary treatment does 
not incorporate the full spectrum of gender 
and sexual identity. 
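
The percentages quoted for the sample can be checked with a few lines of arithmetic. The sketch below is not part of the authors' analysis: it simply recomputes the stated shares from the counts given in the text, and the variable and function names are illustrative assumptions.

```python
# Counts quoted in the text for US-based authors.
us_authors = 54_446         # authors asked about race and ethnicity
career_info = 53_247        # authors who provided career information
reported_category = 38_768  # authors who reported a census category
female_or_male = 38_716     # of those, authors who identified as female or male


def share(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


career_pct = share(career_info, us_authors)
category_pct = share(reported_category, us_authors)
gender_pct = share(female_or_male, reported_category)

print(f"provided career information: {career_pct:.1f}%")
print(f"reported a census category: {category_pct:.1f}%")
print(f"identified as female or male: {gender_pct:.1f}%")
```

Rounded to whole numbers, the first two shares match the 98% and 71% stated above, and the third is consistent with "more than 99%".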


Under-represented groups 

Minority ethnic and racial groups make up 
31% of the US population. People from these
groups are under-represented in the STEM 
workforce (11%), and specifically in the physical 
sciences, at 9% (ref. 9). In the AGU abstracts
data set, African American, Hispanic/Latino, 
Native American and Pacific Islander com- 
prise 7.7% of the first-author abstracts in the 
analysed sample. We combined them into one 
measure — under-represented minority groups 
(URMs). We did so to increase the statistical 
power to detect differences, to limit the risk 
of multiple comparisons generating false 
positives and to avoid including potentially 
identifying information for people from rare 
groups. We admit that this approach erases 
meaningful differences in lived experiences 
between people in these groups, particularly 
those with the lowest representation. Scien- 
tists from each minority group have unique 
barriers to participation. 

We combined the groups White and Asian 
American into non-URM. We did so because 
Asian Americans (4.8% of the US population)
are well represented in the STEM workforce
(20.6%), in physical sciences (17.5%) and in
the analysed sample (10.2% of first-author 
abstracts). We appreciate that this bracket- 
ing, too, is suboptimal — it also erases many 
meaningful differences, pressingly that Asian 
American researchers do face career barriers, 
including implicit and explicit biases (see
Supplementary information). 
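
One way to make these comparisons concrete is a representation ratio: a group's share of the sample divided by its share of a reference population. The text does not compute this ratio, so the sketch below is only an illustration. It assumes that the population shares quoted above (31% and 4.8%) are the appropriate reference values, and the function name is hypothetical.

```python
def representation_ratio(sample_share_pct: float, population_share_pct: float) -> float:
    """Share in the sample divided by share in the reference population.

    A value of 1.0 means proportional representation, below 1.0 means
    under-representation and above 1.0 means over-representation.
    """
    if population_share_pct <= 0:
        raise ValueError("population share must be positive")
    return sample_share_pct / population_share_pct


# Percentages quoted in the text (first-author abstracts vs US population).
urm_ratio = representation_ratio(7.7, 31.0)
asian_american_ratio = representation_ratio(10.2, 4.8)

print(f"URM groups: {urm_ratio:.2f}")
print(f"Asian American authors: {asian_american_ratio:.2f}")
```

On these figures the URM groups sit well below 1 and Asian American authors well above it, which is the contrast the grouping into URM and non-URM relies on.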


Results 


Our analyses focus on the chances of scientists
from minority racial and ethnic groups that 
are under-represented in Earth and space 
sciences being given speaking opportuni- 
ties, compared with other applicants. The 
key proportions are normalized relative to the
population of each group, so that the results 
indicate representation (see Supplementary 
information for all inferential statistics). 

First authors from under-represented 
minority groups contributed 7.7% of all the 
abstracts in the sample (n = 2,981; see ‘Fewer 
abstracts’). The URM applicants were dispro- 
portionately students or early-career scien- 
tists (79% compared with 59% of non-URM 
authors; see ‘Fewer seniors’). At some career 
stages, the small number of URM research- 
ers sometimes led to low statistical power to 
detect differences. 

URM authors were invited to give talks less 
often than were other authors (8% versus 14%, 
normalized; see ‘Too few talks’). Crucially, this 
was statistically significant in the early-career 
stage (and overall). 

From talk-or-poster submissions, URM 
authors were assigned talks less frequently 
than were other scientists (42.9% versus 50.8% 
normalized in each population; see ‘Too few 
talks’). Again, this difference was statistically
significant in the crucial early-career stage 
as well as overall. Compared to others, URM 
authors were more likely to apply to give 
only a poster (35% versus 24%; see ‘Too few 
talks’). This was significant overall and for 
each career stage. 
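
A comparison of normalized rates like this can be illustrated with a standard two-proportion z-test. The article defers its inferential statistics to the Supplementary information, so this is not necessarily the authors' method. The counts below are hypothetical, chosen only to reproduce rates of 42.9% and 50.8%, and the function name is an assumption.

```python
import math


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Two-sided z-test for a difference between two independent proportions.

    Returns (rate_a, rate_b, z, p_value), using the pooled standard error.
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return rate_a, rate_b, z, p_value


# Hypothetical counts (NOT from the article): talks assigned out of
# talk-or-poster submissions in each group.
urm_rate, other_rate, z, p_value = two_proportion_z(429, 1_000, 5_080, 10_000)

print(f"URM rate: {urm_rate:.1%}, non-URM rate: {other_rate:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

Each rate is computed within its own group, which is what "normalized in each population" refers to. With smaller groups the same gap in rates gives a weaker test, in line with the low statistical power reported at some career stages.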

Female URM authors had strikingly few 
opportunities at the AGU Fall Meetings. They 
had even less chance of being invited to talk 
(and applied for posters more often) than had 
URM men (and non-URM women), and were 
assigned talks less often than were non-URM 
women (see ‘Fewest chances’). This is despite 
the fact that women (taking all races and eth- 
nicities together) had equal or more opportu- 
nities to speak than men had (see ‘Equity — why 
so slow?’).

To sum up, scientists from under-repre- 
sented racial and ethnic minority groups 
had the smallest chances of being selected 
and invited to speak, and opted for poster 
presentations more often than did their peers. 


Caveats and confounders 


We did not assess abstract quality. An alterna- 
tive explanation for our results could be that 
URM scientists submitted abstracts of lower 
quality. Even if the AGU’s selection were per- 
fectly meritocratic, any gap in abstract qual- 
ity would still, in our view, suggest bias in 
the STEM pipeline — for example, as a result 
of discrimination in earlier education’ and 
career development. These obstacles result 
in fewer URM scientists than scientists from 
other groups holding positions at elite insti- 
tutions that provide excellent resources and 
strong collaborators. 

Because of small sample sizes, it was not 
possible to control for career stage when we 
analysed by gender (see ‘Fewest chances’). 

We did not investigate why URM geoscien- 
tists applied to give only a poster more often 
than did others overall, and at every career 
stage. There could be several reasons. People 
might be held back by psychological factors 
such as lower self-confidence. For example,
people from under-represented minority 
groups often report ‘impostor syndrome’ — 
feeling isolated and vulnerable in academia 
because they perceive themselves as having 
lower competence than their peers. Or, some
URM scientists might value poster presenta- 
tions — this format could align with different 
goals, interests or lived experiences, for exam- 
ple by enabling researchers to communicate 
findings in one-on-one conversations. 

Because we left out of our analysis those 
based outside the United States, those who 
identified as ‘other’ and those who did not 
report ethnicity or race, our results will prob- 
ably have excluded relevant individuals — peo- 
ple who identify as multiracial, for example. 
Our main analyses therefore represent a 
conservative test of speaking opportunities 
between minority and majority groups.
Notably, combining Asian Americans 
with under-represented minority groups 
would have yielded figures that, at face 
value, looked more representative. We did 
not do this because the US National Science 
Foundation (NSF) does not include Asian 
Americans as an under-represented group 
in STEM; its policy efforts are focused on
the under-represented minorities we track
here. In the Supplementary information, we 
report separate exploratory analyses con- 
cerning Asian Americans, and examine career 
stage further, because of geoscience-specific 
nuances in the recruitment and representa-
tion of people who identify thus.

We must also point out that other nations
might apply different census definitions to
those used here. For example, ‘White’ in the
United States encompasses people who have
origins in the Middle East or North Africa.

WHO GETS THE MICROPHONE?

Some minority scientists who are already under-represented in science, technology, engineering and mathematics
(STEM) and in geoscience are increasingly under-represented at every step on the path to speaking at the
American Geophysical Union’s Fall Meeting — in terms of abstract submissions, seniority and being offered talks.

SUBMISSIONS

Fewer abstracts
Authors from under-represented minority groups (URMs;
see main text for definitions) submitted the smallest
proportion of abstracts in total and by career stage.
Chart: proportion of all abstracts (%) for URM and non-URM authors, in total and at the student, early-career, mid-career and late-career stages.

OPPORTUNITIES

Too few talks
URM authors were invited or assigned to speak less
often than were other authors, at most career stages.
By contrast, a larger proportion of URM authors than
others applied for the poster-only option.

Fewer seniors
Among URM authors, a bigger proportion of abstracts
are submitted by students and early-career scientists
than from non-URM authors at these career stages.

FEWEST CHANCES

URM women comprise the group that is least likely to
be invited or assigned to speak. But they are
over-represented in requesting to present posters.
*Statistically significant difference
from URM women (P <0.05).


Next steps 


To recap: a woman starting out in her career 
from a racial or ethnic minority group that is 
under-represented in US geoscience is less 
likely to gain a speaking slot at the field’s 
largest conference than are her male peers 
and her non-Hispanic White peers of both 
sexes. These findings hold sobering lessons 
for the AGU and other STEM conferences and 
activities. We pre-registered our data cleaning 
and main confirmatory analyses at the Open 
Science Framework to increase generalizabil- 
ity (see Supplementary information). 

One of the AGU’s goals for inviting speak- 
ers is to “enhance diversity and/or feature 
early-career scientists”. It is particularly con- 
cerning that where URM authors are most 
numerous — in the least-established career 
stages — they get fewer invitations than their 
proportion would predict. Such early inequi- 
ties are likely to affect the retention and pro- 
motion of people from under-represented 
minority groups across geoscience. 

There are three clear steps for the AGU to 
take. First, conference conveners should be 
blinded to information that is not necessary 
to rate the quality of submissions. Identify- 
ing details such as names and institutions 
introduce bias even in people committed
to equity, because many thinking processes, 
such as stereotype activation, occur outside 
awareness or control. Double-blind review 
has decreased bias in allocating time on the 
Hubble Space Telescope.

Second, the AGU should encourage more 
scholars from under-represented minority 
groups to participate as conveners. Third, the 
AGU should provide more travel grants to URM 
presenters, which could increase the overall 
population of URM attendees both directly 
and by shifting norms. We encourage other 
STEM conferences to make these changes. 

Meanwhile, the rest of the community has 
work to do too (see ‘Equity — why so slow?’).
Established scholars can support scientists 
from minority groups by encouraging them 
to submit talk abstracts and by providing 
opportunities to practise presenting in local, 
domestic and international venues. These 
steps can increase confidence and foster the 
development of people’s identity as scientists. 

It is crucial for universities and funding
agencies to support organizations that pro-
vide openings and mentorship to young schol-
ars from minority groups, such as the Society
for Advancement of Chicanos/Hispanics and
Native Americans in Science. The NSF aims
to broaden participation in STEM through
its criteria for grant proposals and through
initiatives such as NSF INCLUDES (Inclusion
across the Nation of Communities of Learners
of Underrepresented Discoverers in Engineer-
ing and Science). Such programmes can liaise
with professional societies.

Racial, ethnic and gender biases harm indi-
viduals and undermine the quality of science.
Even if all demographic gaps were plugged
tomorrow at the level of people graduating
with PhDs, and even if these graduates did not
have to run the gauntlet of systematic bias that
their predecessors faced, it could still take
generations to achieve fair representation
among senior academics.

We therefore urge more organizations to
measure and share the outcomes for scholars
from minority groups. With this information
and the growing literature on effective inter-
ventions, together we can create a more
equitable scientific community.


Equity — why
so slow?

Laws, policies, training, research
and tracking must benefit all.

In the United States, affirmative action
is a set of laws, guidelines and policies
that aim to increase the representation
of historically excluded groups in higher
education and professional careers.
Overall, White women have been the
primary beneficiaries, as our results
underscore.

A report last year by the US National
Science Foundation showed that minority
ethnic and racial groups are under-
represented in graduate programmes, and
that this results in reduced economic and
social opportunities.

An inclusive environment, visible role
models and adequate funding are key to
enabling people from under-represented
minority groups to participate and succeed
in science, technology, engineering and
mathematics (STEM). A growing body
of research has highlighted the subtle,
indirect and often unintentional actions
perpetrated against such researchers by
majority groups, and which have an impact
on a sense of belonging in STEM spaces,
as well as on career persistence and well-
being.

Small interventions can help, such as
asking STEM community members to be
mindful of equity, diversity and inclusion.
Reminding individuals, particularly men,
to consider diversity when selecting
potential reviewers can improve gender
representation.

However, the effects of these reminders
on ethnicity bias have not been studied,
and reminders might not be effective in
the long term in reducing implicit biases
in STEM. Implicit-bias training is well-
meaning but largely ineffective. H.L.F.,
C.B. et al.
Police patrol a food market at night in Kashgar in China’s Xinjiang province.


Crack down on 
genomic surveillance 


Yves Moreau 


Corporations selling 
DNA-profiling technology 
are aiding human-rights 
abuses. Governments, 
legislators, researchers, 
reviewers and publishers 
must act. 
Across the world, DNA databases
that could be used for state-level 
surveillance are steadily growing. 
The most striking case is in China. 
Here police are using a national DNA 
database along with other kinds of surveillance 
data, such as from video cameras and facial 
scanners, to monitor the minority Muslim 
Uyghur population in the western province 
of Xinjiang. 

Concerns about the potential downsides of 
governments being able to interrogate people’s 
DNA have been voiced since the early 2000s 
(ref. 1) by activist groups, such as the non-profit
organization GeneWatch UK, and some genet-
icists (myself included). Partly thanks to such
debate, legislation and best practices have
emerged in many countries around the use of
DNA profiling in law enforcement. (In profiling,
several regions across the genome, each con- 
sisting of tens of nucleotides, are sequenced 
to identify a person or their relatives.) 

Now the stakes are higher for two reasons. 
First, as technology gets cheaper, many 
countries might want to build massive DNA 
databases. Second, DNA-profiling technology 
can be used in conjunction with other tools for 
biometric identification — and alongside the 
analysis of many other types of personal data, 
including an individual’s posting behaviour 
on social networks. Last year, the Chinese 
firm Forensic Genomics International (FGI) 
announced that it was storing the DNA pro- 
files of more than 100,000 people from across 
China (FGI, known as Shenzhen Huada Forensic 
Technology in China, is a subsidiary of BGI,
the world’s largest genome-research organiza- 
tion). It made the information available to the 
individuals through WeChat, China’s equiva- 
lent of WhatsApp, using an app accessed by 
facial recognition. 

With stringent safeguards and oversight, it
is legitimate for law-enforcement agencies to
use DNA-profiling technology. But these uses
can easily creep towards human-rights abuses.
In October this year, the US Department of
Homeland Security announced that it would
authorize the mandatory collection of DNA
samples from immigrants in federal custody
at the US border, including children and those
applying for asylum at legal ports of entry.
The resulting DNA profiles will be available
through a database called CODIS (Combined
DNA Index System), which includes the pro-
files of convicted offenders and individuals
arrested for serious offences. Such treatment
could reinforce debunked claims that immi-
grants are more prone to criminal behaviour
than the general population.

SOURCE: MIT TECH. REV. 2019 (HTTPS://GO.NATURE.COM/2MHTJED)

A much broader array of stakeholders must
engage with the problems that DNA data- 
bases present. In particular, governments, 
policymakers and legislators should tighten 
regulation and reduce the likelihood of corpo- 
rations aiding potential human-rights abuses 
by selling DNA-profiling technology to bad 
actors — knowingly or negligently. Research- 
ers working on biometric identification tech- 
nologies should consider more deeply how 
their inventions could be used. And editors, 
reviewers and publishers must do more to 
ensure that published research on biometric 
identification has been done in an ethical way. 


Government monitoring 


In Xinjiang in China, police collected biometric 
information (including blood samples, finger- 
prints and eye scans) from nearly 19 million 
people in 2017, in a programme called ‘Physicals
for All’. This was part of a suite of measures that 
are being used by the Chinese government to 
control the Uyghur ethnic group.

Other nations are building massive DNA 
databases or considering doing so. In 2015, 
Kuwait passed a law mandating DNA profiling
of its entire population. Foreigners living in 
Kuwait and even visitors were to be included. 
In January this year, Kenya passed a law that 
would have enabled the government to require 
all citizens to submit any biometric informa- 
tion, including DNA profiles, to a national 
database. 

Both cases have hit obstacles. Kuwait’s 
Constitutional Court overruled the 2015 law 
two years later, because of concerns about 
how the database could be used in violations 
of privacy and due process. And, thanks to a
decision taken by Kenya’s High Court in April, 
DNA is now excluded from national efforts to 
collect biometric data. 

But these and other examples indicate that 
governments keep being tempted to hoover 
up their citizens’ DNA data.


Corporate responsibility 

One way to reduce the likelihood of massive
DNA databases being misused is to change
the behaviour of the companies that invest
in DNA-profiling technologies (see ‘Ethical
divesting’).

DNA TESTING FOR ALL

An increasing number of people are having their DNA
analysed by consumer-genomics companies.
Chart: people tested (millions), shown for Ancestry, 23andMe and others.
Relatedness means that the genetic privacy of untested
people is at risk now that firms hold DNA data for ~5% of the
US population.

US and European corporations are still 
the dominant providers of such technolo- 
gies. The deployment of DNA-surveillance 
infrastructure in Xinjiang, for example, 
was enabled by the Chinese government 
buying products from — and working with 
— the US company Thermo Fisher Scien- 
tific in Waltham, Massachusetts. The firm 
is currently the global leading supplier of 
DNA-profiling technology in law enforce- 
ment. Thermo Fisher Scientific researchers 
have worked with China’s Ministry of Justice, 
and with researchers at the People’s Public 
Security University of China, which falls 
directly under the Ministry of Public Secu-
rity, to tailor the technology specifically for
use in Tibetan and Uyghur populations.
(Thermo Fisher Scientific did not respond to a
request for comment.) However, in February,
after two years of public outcry and intense 
pressure from high-profile US senators, the 
company announced that it would stop sell- 
ing its DNA-profiling technology in Xinjiang. 

Marketing and lobbying by technology sup- 
pliers is often behind pushes for the broadest 
possible use of DNA profiling. In 2016, for 
instance, a representative of a US lobbying 
firm working for Thermo Fisher Scientific 
described in a conference presentation the 
development of universal DNA databases as 
“inevitable”. He noted that the expansion of 
these to “Western countries or other coun- 
tries with democratic forms of government” 
faced “significant hurdles”, such as the “open
and public parliamentary process” and the
“culture of being influenced by opposition
and protests” (see go.nature.com/337pjce).

Restrictions on the use of technologies 
or services provided by corporations are 
currently too weak. Take export controls: 
either they do not pay due attention to 
these sensitive technologies, or they have 
loopholes that often render them useless. 
For example, US laws forbid the export of 
fingerprint-recognition technology to some 
destinations or users deemed problematic 
by the US government, such as the Chinese 
police. But the United States does not restrict 
the export of more-invasive DNA-profiling 
and facial-recognition technologies. Mean- 
while, the European Union does not regulate 
the export of fingerprint technology, even 
though the dominant global suppliers are 
European. 

Export controls for biometric technologies 
could be improved relatively easily. The US 
Department of Commerce is currently con- 
sidering revising regulations for emerging 
technologies, such as Internet censorship
and video surveillance, to try to reduce the 
likelihood of companies doing business with 
problematic buyers. Last month, it barred 
Xinjiang police forces and eight Chinese 
technology companies from buying US prod- 
ucts or importing US technology because of 
their role in the repression of Uyghurs. 

Some regulatory initiatives are promising 
and could provide a deterrent if enforced. 
The 2017 EU directive on non-financial 
reporting (named 2014/95) has mandated 
that large companies listed on stock markets 
document their social and environmental 
impacts in their annual reports for share- 
holders and the public. Since 2017, France’s 
corporate ‘duty of vigilance’ law has required 
all French companies employing more than 
5,000 people in the country to actively 
monitor their impacts on human rights, 
the environment and so on (see
go.nature.com/2o08tcvn).

In the United States, several human-rights 
lawyers have attempted to revive the Alien 
Tort Statute (28 U.S.C. § 1350) over the past 
20 years. Produced in 1789 but never deployed,
this law could enable a foreign individual to
make a civil liability claim against a domestic
corporation in US courts. A carefully crafted 
Alien Tort Statute could provide a way to hold 
companies to the same standards, whether 
they are operating at home or abroad. 

Ultimately, international laws must be 
established that clearly stipulate the human- 
rights responsibilities of corporations. For 
the past decade, a United Nations working 
group has been drafting a treaty to regulate 
the activities of transnational corporations 
with regard to human rights and the environment
(see go.nature.com/35qnehe). If it
is not crippled by lobbying, this could eventually
become a powerful tool to promote
ethical business practices. Yet companies are
only part of the story when it comes to the 
potential misuse of DNA databases. 


Research ethics 


The chain of technology development leads 
from fundamental to applied research to the 
products that enable the abuses. More aca- 
demics working on biometric identification 
technology should reflect on the potential 
misuses of their inventions and engage with 
society. For instance, they can contribute
to mainstream media, participate in public 
debates or join ethics boards. 

Recent events indicate that publishers and 
scholars might be paying insufficient attention
to the sources of biometric-identification
research. For example, in August last year, 
after several Human Rights Watch and media 
reports about the surveillance abuses in 
Xinjiang, Springer Nature published the pro- 
ceedings of a biometrics conference held in 
the province. (Springer Nature has been the 
publisher of the proceedings of the Chinese 
Conference on Biometric Recognition for 
nine years; Nature is editorially independ- 
ent of its publisher.) One of the conference 
papers, on technologies for recognizing 
various languages in images, described how 
“Uyghur information” (referring to the Uyghur
language script) could be detected in images
that might be used to evade Internet censorship.
Another paper described how products
from Thermo Fisher Scientific and the Chinese
firms Hisign, Megvii and iFlytek are being used
to build a population-scale database for DNA,
fingerprint, face and voice information in a
major Chinese city.

In July this year, researchers from Imperial 
College London announced the results of an 
open competition on facial recognition. (The 
winners presented their work at a conference
in Seoul in October.) Before a reporter from
the non-profit news platform Coda pointed
it out, one of the sponsors of the conference
had been a Chinese artificial-intelligence
start-up called DeepGlint, which in 2018 set
up a joint research laboratory with the Xinjiang
police. The conference organizers removed 
DeepGlint as a sponsor in August. 

Over the past eight years, three leading forensic
genetics journals — International Journal of
Legal Medicine (published by Springer Nature),
and Forensic Science International and Foren- 
sic Science International: Genetics Supplement 
Series (both published by Elsevier) — have pub- 
lished 40 articles co-authored by members 
of the Chinese police that describe the DNA 
profiling of Tibetans and Muslim minorities, 
including people from Xinjiang. I analysed 
529 articles on forensic population genetics in 
Chinese populations, published between 2011 
and 2018 in these journals and others. By my 
count, Uyghurs and Tibetans are 30-40 times
more frequently studied than are people from
Han communities, relative to the size of their
populations (unpublished data). Half of the
studies in my analysis had authors from the
police force, military or judiciary. The involvement
of such interests should raise red flags to
reviewers and editors.


38 | Nature | Vol 576 | 5 December 2019


ETHICAL
DIVESTING


Investors could help to ensure ethical use
of the products of DNA-profiling firms.


Public outcry can lead to divestment.
Since March this year, for example, major
US funds such as Goldman Sachs have
divested all their shares from the Chinese
surveillance company Hikvision, because
of concerns about the use of the company’s
products in human-rights breaches.
Investors could even be motivated to
scrutinize company ethics, thanks to studies
over the past five years or so indicating
that ‘good’ corporate social responsibility
practices tend to correlate with better
financial performance over the long term.
Pressure from investors, and from the
public in general, might be increasingly
powerful. Take Thermo Fisher Scientific’s
February announcement that it would
stop selling its DNA-profiling technology
in Xinjiang, China. Although Chinese
authorities can easily transport such
technology from elsewhere in the country,
it is significant that a major corporation
publicly acknowledged “the importance of
considering how [its] products and services
are used — or may be used — by [its]
customers”. Y.M.

In short, the scientific community in 
general — and publishers in particular — need 
to unequivocally affirm that the Declaration 
of Helsinki (a set of ethical principles regard- 
ing human experimentation, developed for 
the medical community) applies to all biometric
identification research (see
go.nature.com/34bypbf). Unethical work that has been
published in this terrain must be retracted. 


Privacy concerns 


DNA databases in local police forces are 
proliferating, even in countries that have 
democratic governments and well-established
legal protections for citizens’ privacy.
By August this year, for instance, the Office 
of the Chief Medical Examiner of New York 
City held more than 82,000 genetic profiles. 
At the same time, there has been growth
in consumer and recreational genomic services,
such as those of the US corporations 23andMe
in Mountain View, California, and Ancestry in
Lehi, Utah (see ‘DNA testing for all’). Medical
DNA sequencing is also becoming routine.
Currently, only some consumer-genomics
companies have willingly shared people’s DNA 
data with law-enforcement agencies. And in 
many countries, patients’ data are confidential. 

But to deploy DNA surveillance across a 
group of people, you need profiles from only 
2-5% of that population, because biological 
relationships can be inferred. And as genealogy
and medical databases mushroom, law
enforcers and others are increasingly tempted
to tap into them. In 2017 in the Netherlands,
the Ministry of Health drafted a bill that would 
have allowed police to obtain people’s DNA 
information from hospitals in some limited 
cases. It was abandoned following public 
outcry. 
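
The 2-5% threshold quoted above can be made plausible with a back-of-envelope calculation. The sketch below is a deliberately simple model and not the method behind the cited figure: it assumes that profiles are sampled at random from the population and that every person has the same number of relatives close enough to be detected through a DNA match. The function name and the value of 200 detectable relatives are hypothetical choices made only for illustration.

```python
def chance_of_relative_in_database(fraction_profiled, detectable_relatives):
    """Probability that at least one relative of a person is in the database.

    Assumption (hypothetical): each relative is in the database
    independently, with probability `fraction_profiled`.
    """
    chance_none = (1.0 - fraction_profiled) ** detectable_relatives
    return 1.0 - chance_none


# Illustrative values only: 200 detectable relatives per person is an assumption.
for fraction in (0.02, 0.05):
    chance = chance_of_relative_in_database(fraction, 200)
    print(f"{fraction:.0%} profiled -> {chance:.1%} linked through a relative")
```

Under these assumptions, a database covering only a few per cent of a population still leaves most untested people reachable through a relative. The exact figure depends strongly on the assumed number of detectable relatives.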

And June saw what might be a game changer
in the United States. The Orlando Police 
Department obtained a warrant that allowed 
it to search the entire DNA database of the 
GEDMatch genealogy website, based in Lake 
Worth, Florida. Because consumer-genomics 
companies already hold DNA data for an esti- 
mated 5% of the US population, unfettered 
access to these data by law-enforcement 
agencies would simply spell the end of genetic 
privacy in the United States.

All of us must beware a world in which our 
behavioural, financial and biometric data, 
including our DNA profiles, or even entire 
genome sequences, are available to corpora- 
tions — and so potentially to law enforcers and 
political parties. Without the changes outlined 
here, the use of DNA for state-level surveillance 
could become the norm in many countries. 
Readers respond


Correspondence 


Joint statement 
on EPA proposed 
rule and public 
availability of data 
(2019) 


Eighteen months after
articulating our concerns
(J. Berg et al. Nature
http://doi.org/crq8; 2018) regarding
the 2018 ‘Strengthening
Transparency in Regulatory
Science’ rule proposed by the
US Environmental Protection
Agency (EPA;
go.nature.com/2kmtd7g), we have become
more concerned in response
to recent media coverage and
a 13 November hearing on the
role of science in decision-making
at the EPA. These events
suggest that the proposed
rule is now moving towards
implementation; whether it
includes amendments sufficient
to address the concerns raised
by us and many others remains a
question.

Our previous statement on 
the proposed rule, authored 
and published by the editors- 
in-chief of five major scientific 
journals in May 2018, reflected 
alarm that the proposal’s 
push for ‘transparency’ would 
be used as a mechanism 
for suppressing the use of 
relevant scientific evidence 
in policy-making, including 
public-health regulations. 
After the public comment 
period for the proposed rule 
closed, the EPA reported more 
than 590,000 comments from 
individuals and scientific, 
medical and legal groups, 
many of which articulated 
similar concerns (see
go.nature.com/2jfxhhn).

As leaders of peer-reviewed 
journals, we support open 
sharing of research data, but 
we also recognize the validity 
of scientific studies that, 
for confidentiality reasons, 
cannot indiscriminately 
share absolutely all data.
Data sets featuring personal
identifiers, including
studies evaluating genomes
of thousands of people to
characterize medically relevant
genetic variants, are but
one example. Such data may
be critical to developing new
drugs or diagnostic tools, but
cannot be shared openly; even
anonymized personal data can
be subject to re-identification,
and it has been a long-standing
practice for agencies and
journals to acknowledge
the value of data-privacy
adjustments. The principles
of careful data management,
as they inform medicine,
are just as applicable to data
regarding environmental
influences on public health.
Discounting evidence from
the decision-making process
on the basis that some data are
confidential runs counter to the
EPA’s stated mission “to reduce
environmental risks ... based
on the best available scientific
information” (see
go.nature.com/2kqheny).

We are also concerned
about how the agency plans
to consider options related
to existing regulations.
Even if a new standard is not
applied retroactively, the
standard could apply when a
regulation is updated; thus,
foundational science from years
past (research on air quality
and asthma, for example,
or water quality and human
health) could be deemed by
the EPA to be insufficient for
informing our most significant
public-health issues. That would
be a catastrophe.

We urge the EPA to continue to 
adopt an approach that ensures 
the data used in decision- 
making are the best available, 
which will at times require 
consideration of peer-reviewed 
scientific data, not all of which 
may be open to all members of 
the public. The most relevant 
science, vetted through peer 
review, should inform public
policy. Anything less will harm
decision-making that claims to 
protect our health. 

We hope that in the end, 
decisions that are made to 
inform the proposed EPA rule 
will rise above any form of 
politics, focusing on what’s 
best for our communities. 

We encourage anyone with
concerns or opinions about
this issue to express their views
through relevant legislative
channels. Whether submitting
public comments to the EPA or
communicating with lawmakers
in Congress, it is important to
emphasize that decision-making 
that affects us all should be 
informed by nothing less than 
the full suite of relevant science 
vetted through peer review. 


H. Holden Thorp Science family 
of journals, Washington DC, USA. 
hthorp@aaas.org 


Magdalena Skipper Nature, 
London, UK. 


Veronique Kiermer Public Library 
of Science (PLoS) journals, San 
Francisco, California, USA. 


May Berenbaum Proceedings of 
the National Academy of Sciences, 
Washington DC, USA. 


Deborah Sweet Cell Press, 
Cambridge, Massachusetts, USA. 


Richard Horton The Lancet, 
London, UK. 


Editor's note: This statement was 
published online on 26 November, 
and simultaneously as a letter in 
Science (H. Holden Thorp et al.
Science https://doi.org/10.1126/science.aba3197;
2019), which
should be the primary citation. 

It is being disseminated by other 
publications represented by the 
signatories. 


Boost glacier 
monitoring 


Glacier-mass changes are a
reliable indicator of climate 
change. On behalf of the 
worldwide network of glacier 
observers, we urge parties to 
the United Nations Framework 
Convention on Climate 
Change to boost international 
cooperation in monitoring these 
changes, and to include the 
results in the Paris agreement’s 
global stocktake. 

Since 1960, glaciers have lost 
more than 9,000 gigatonnes of 
ice worldwide — the equivalent 
of a 20-metre-thick layer with
the area of Spain. This melting
alone, as distinct from that of
the Greenland and Antarctic ice
sheets, has raised global sea
level by almost 3 centimetres,
contributing 25-30% of the total
rise (M. Zemp et al. Nature 568,
382-386; 2019). 
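
The two equivalences quoted above can be checked with a rough calculation. The sketch below is illustrative only: the ice density, the area of Spain and the global ocean area are approximate values assumed here and are not taken from the letter, and the meltwater is treated as fresh water spread evenly over the ocean.

```python
ICE_LOST_GT = 9_000            # from the text: more than 9,000 gigatonnes since 1960
ICE_DENSITY_KG_M3 = 917        # assumption: density of glacier ice
WATER_DENSITY_KG_M3 = 1_000    # assumption: meltwater treated as fresh water
SPAIN_AREA_KM2 = 506_000       # assumption: approximate area of Spain
OCEAN_AREA_KM2 = 362_000_000   # assumption: approximate global ocean area

KG_PER_GT = 1e12               # 1 gigatonne = 1e9 tonnes = 1e12 kg
M2_PER_KM2 = 1e6


def layer_thickness_m(mass_gt, density_kg_m3, area_km2):
    """Thickness of a uniform layer of the given mass spread over the given area."""
    volume_m3 = mass_gt * KG_PER_GT / density_kg_m3
    return volume_m3 / (area_km2 * M2_PER_KM2)


ice_layer_m = layer_thickness_m(ICE_LOST_GT, ICE_DENSITY_KG_M3, SPAIN_AREA_KM2)
sea_level_rise_cm = 100 * layer_thickness_m(
    ICE_LOST_GT, WATER_DENSITY_KG_M3, OCEAN_AREA_KM2
)

print(f"Ice layer over Spain: {ice_layer_m:.1f} m")
print(f"Sea-level contribution: {sea_level_rise_cm:.1f} cm")
```

Under these assumptions the ice layer comes out close to 20 metres and the sea-level contribution at roughly 2.5 centimetres. The second number is a lower bound, because the letter gives the loss as more than 9,000 gigatonnes.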

The present rate of melting 
is unprecedented. Several 
mountain ranges are likely to 
lose most of their glaciers this 
century. And we face the loss 
of almost all glaciers by 2300 
(B. Marzeion et al. Cryosph. 6, 
1295-1322; 2012). 

Glacier shrinkage will severely 
affect freshwater availability 
and increase the risk of local 
geohazards. Global sea-level rise 
will result in the displacement 
of millions of people in coastal 
regions and in the loss of
life, livelihoods and cultural- 
heritage sites. 

The systematic monitoring of 
glaciers has been internationally 
coordinated for 125 years. 
Continuing to do so will 
document progress in limiting 
climate change for current and 
future generations. 


Michael Zemp* World Glacier 
Monitoring Service, University of 
Zurich, Switzerland. 
michael.zemp@geo.uzh.ch 

* On behalf of 38 co-signatories; 
see go.nature.com/34ak25y 
Expert insight into current research


News & views 


Metallurgy 


Fine-grained metals 
from 3D printing 


Amy J. Clarke 


Conventional alloys have undesirably coarse-grained 
microstructures when used in 3D printing. A designer alloy 
overcomes this problem, potentially opening the way to the 
widespread adoption of 3D metal printing. See p.91 


There are many potential benefits to using 
additive manufacturing — also known as 3D 
printing — for making metal parts, rather than 
conventional manufacturing processes. For 
example, additive manufacturing is highly cus- 
tomizable, it can produce complex structures 
and it can be used for the economical production
of low numbers of metal components. But
to achieve the strict specifications needed for
some applications, the microscopic structure
of printed metal objects must be controlled.
On page 91, Zhang et al.¹ describe titanium-copper
alloys that produce practically useful
microscopic structures during additive man- 
ufacturing, removing the need for subsequent 
treatment. The resulting materials exhibit 
promising combinations of mechanical prop- 
erties, comparable to those of the ubiquitous 
structural alloy Ti-6Al-4V, produced using 
conventional and additive manufacturing 
processes. 

In metal additive manufacturing, an alloy 
(in the form of powders or wires) is deposited 
in a layer and then melted by a rapidly moving
heat source to form a solid mass; successive 
layers are built up to produce a 3D part. The 
process typically produces large tempera- 
ture gradients, high solidification rates and 
repeated cycles of heating and cooling. A 
common characteristic of 3D-printed metals 
is coarse columnar grains that grow along spe- 
cific directions of the crystal lattice that are 
favourably oriented with the heat flow (Fig. 1a). 

Coarse columnar grains are usually 
undesirable because they can cause the 
printed material to have direction-dependent 
(anisotropic) mechanical properties and make 
it susceptible to tearing or cracking during 
solidification. However, columnar solidification
can undergo a transition to equiaxed
solidification, in which the grains produced
have similar dimensions in all directions, by
changing the processing conditions used for
additive manufacturing. Alloys with equiaxed
grains have desirably uniform properties, and
so methods for producing them are of great
technological value.

Models and experiments have been used 
to study the columnar-to-equiaxed transi- 
tion (CET) in nickel-based alloys that have 
been melted using an electron beam. The
number of nuclei (tiny crystals that ‘seed’
the growth of the solid phase) in the liquid
metal, and the processing conditions used
during electron-beam additive manufacturing,
were found to have a larger influence on
grain structure than did the composition of
the alloy. This suggests that the CET can be
controlled through process design and by 
promoting nucleus formation in alloy melts. 
Additives called inoculants, which cause nuclei 
to form in the melt, have been incorporated 
into metal-alloy powders used in additive manufacturing,
to increase the density of nuclei
and thereby promote the formation of equiaxed
grains. However, suitable inoculants for
titanium alloys remain elusive. 

Zhang et al. now show that fine equiaxed 
grains, on average less than 10 micrometres 
in diameter, can be produced in titanium-copper
alloys during additive manufacturing,
without adding inoculants (Fig. 1b). The
authors propose that nucleation and CET
are promoted in these alloys by the formation
of a large zone of supercooled liquid: melted
alloy that is fully liquid, despite its being below
the temperature at which the alloy should start
to solidify. The final product consists of two
solid phases that contain different amounts 
of titanium and copper, forming a microstruc- 
ture that includes nanoscale plates (lamellae). 
The mechanical properties of the printed 
material compare favourably with those 
of Ti-6Al-4V, and of cast (and heat-treated) 
titanium-copper alloys. 

The authors suggest that equiaxed grains
are produced during solidification of the
melt, and that further microstructural refinement
might then occur during the cyclical
temperature changes associated with the
3D-printing process. However, it is difficult
to tell unambiguously whether the solidification
step is the genesis of the fine grains,
because the microstructures produced at
high temperatures during solidification will
be replaced by features that develop during
subsequent solid-state phase transitions.
Another plausible scenario is that columnar
grains form during solidification, and that
equiaxed grains are produced and refined
during solid-state thermal cycling. Such grain
refinement has been reported in steels.


Figure 1 | Grain structure in printed metals. a, When conventional metal alloys are used for 3D printing,
large columnar grains tend to form, as shown here for the structural alloy Ti-6Al-4V. This causes the
printed alloy to have undesirable anisotropic (direction-dependent) properties. b, Zhang et al.¹ report that
titanium-copper alloys produced by 3D printing contain fine grains that have similar dimensions in all
directions. The alloy shown here was produced using the same conditions as in a. (Images from ref. 1.)

When steels that have a two-phase lamellar 
microstructure at low temperatures are heated 
above a critical temperature, new grains of a 
third phase (austenite) nucleate and grow. The 
two low-temperature phases then re-form on
cooling. Repeated nucleation and growth of
the various phases can therefore occur under 
suitable conditions during thermal cycling, 
leading to significant grain refinement. 

Alloys such as Ti-6Al-4V typically do not
undergo grain refinement during thermal
cycling, because no new grains of the
high-temperature phase nucleate. However, it
is unclear whether new grains of the high-temperature
phase can nucleate and grow in Ti-6Al-4V
during thermal cycling typical of additive
manufacturing, which might conceivably
refine grains. Zhang and colleagues’ titanium-copper
alloys have high- and low-temperature
phases analogous to those of steels. Clarifying
the role of nucleation and growth of these
phases in grain refinement during thermal
cycling should be a topic of future research.

A deeper understanding of solidification 
and solid-state phase transitions is clearly 
needed to guide the design of future alloys for 
additive manufacturing and to control their 
microstructures, although the nucleation
stage is hard to study experimentally. It is also
imperative that we have a better understanding
of how the rapidly changing conditions
during additive manufacturing influence
microstructure development. In situ characterization
of phase transitions and dynamic
phenomena, for example using imaging and
diffraction techniques in experiments that
simulate the conditions of additive manufacturing,
might help to unveil some of the complexity
of the processes involved. Such efforts
are timely, and are necessary to produce opti- 
mized alloys that will lead to the widespread 
adoption of additive manufacturing for the 
production of high-performance structural 
parts, for which reliably high-quality micro- 
structures and mechanical properties are of 
the utmost importance. 


Amy J. Clarke is in the George S. Ansell 
Department of Metallurgical and Materials 
Engineering, Colorado School of Mines, 
Golden, Colorado 80401, USA. 

e-mail: amyclarke@mines.edu 
The fruit fly gets oriented


Malcolm G. Campbell & Lisa M. Giocomo 


Two studies in flies reveal the mechanism by which the 
brain’s directional system learns to align information about 
self-orientation with environmental landmarks — a process 
crucial for accurate navigation. See p.121 & p.126 


As everyone knows, a good sense of direction 
is needed to successfully navigate the world. 
In mammals, this ‘sense’ involves neurons 
called head-direction cells. Each such cell 
becomes most active when the animal faces a 
particular direction relative to landmarks in 
its environment. Together, the cells’ activity 
indicates which direction the animal is facing 
in at any given moment. In 2015, it emerged 
that fruit flies, which are much easier than 
mammals to study experimentally, have 
strikingly similar cells, called heading neurons¹.
Fisher et al.² (page 121) and Kim et al.³
(page 126) now build on this discovery to tackle
a decades-old problem: how does this type of
neuron respond to the locations of landmarks
in a manner that is stable enough to be reliable,
but flexible enough to allow adaptation to new
environments?

To give an example of the problem, imagine 
emerging from a subway station onto a 
crowded street. If you are a regular visitor, 
a glance around is all you need to be on your 
way. However, if you have never been to this 
station before, you might need a moment to 
orient yourself. You take note of surrounding 
street signs, shops and monuments. Before 
long, you have your bearings and can set off 
in the right direction.

This example highlights two challenges 
for the brain’s directional system. First, it 
must stably indicate direction in familiar 
environments: returning to the same station 
should call the same orientation to mind. 
Second, it must have the flexibility to learn 
new configurations of landmarks, even when 
similar landmarks have been seen before: the
particular configuration of street signs at 
the new station must be learnt, even though 
you may have seen similar street signs in 
other places. 

The neural mechanisms that underlie these 
abilities in flies are a beautiful example of 
form following function. The insects’ heading
neurons (also known as E-PG, or compass,
neurons) are arranged in a ring (Fig. 1) that
corresponds to the 360° of possible directions
in which the fly can face¹, sometimes
called heading angles. Because of inhibition
between neurons, only one heading angle 
can be indicated at one time, providing the 
fly with an unambiguous signal. Of note, rather 
than always aligning their activity to a cardi- 
nal direction such as north, heading neurons 
realign their activity arbitrarily when the fly 
enters a new environment. The heading neu- 
rons receive input from visual ring neurons, 
which are activated by visual cues at partic- 
ular orientations relative to the fly, and from 
internal cues about self-motion. 

Fisher et al. set out to test whether and how
the connections between visual ring neurons
and heading neurons change with experience,
using a range of experimental techniques
(many of which are possible only in fruit flies).
They implemented a virtual-reality (VR) system
in which the fly walked on a floating ball.
An array of lights around the fly flashed on and
off in concert with the animal’s movements,
providing visual cues to enable the fly to orient
itself. The authors then measured inputs
from visual ring neurons to heading neurons 
as the flies explored this virtual environment. 
They also used genetic techniques to inhibit 
the activity of visual ring neurons. 

These experiments revealed that individual
heading neurons are inhibited by visual ring
neurons that are activated by visual cues at
specific angles relative to the fly. Because of
the specificity of this pairing, the visual input
reinforces the directional preference of the
heading neurons. This work solves the problem
of how the brain can transform visual input
into a stable directional signal in a familiar
environment, the first of the challenges in
our subway scenario.


TANYA WOLFF


Figure 1 | Neurons in the central complex of the fruit-fly brain, tagged with fluorescent proteins. The
central complex includes a ring-like structure called the ellipsoid body that contains heading neurons. These
cells correspond to all the possible directions in which the fly can face, providing the insect with a compass-like
signal that it uses to navigate. Two studies²,³ have revealed how flies orient themselves in familiar
environments and adapt to new ones, thanks to signalling to heading neurons from visual ring neurons,
which originate in the eyes (not shown).

Next, Fisher et al. tested how heading 
neurons can adapt when their environment 
changes. They presented flies with two 
identical visual cues, separated by 180°: an
ambiguous environment in which a half turn
produces the same visual cue as a full turn.
The flies’ heading neurons, which can represent
only one heading angle at a time, flipped
between being preferentially activated by two 
opposing heading orientations. 

After the flies were returned to the one- 
cue world, the relationship between visual 
input and the activity of the heading network 
as a whole sometimes changed by 180°. The 
strength of visual inputs to heading neurons 
also changed, but only in neurons that were 
active during the two-cue period. 

This finding shows that new associations can
form between visual ring neurons and heading
neurons in new environments. However,
simple visual changes are not enough. Instead,
there must be a coordinated activation of the
upstream visual ring neuron and downstream
heading neuron. This leads to a decrease in
the strength of the inhibitory synaptic connection
between them, so that the heading
neuron becomes less sensitive to inhibition by
the visual ring neuron, a phenomenon known
as associative plasticity.
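
The learning rule described above can be illustrated with a toy model. This is a minimal sketch under strong simplifying assumptions and is not the model used in either study: the network has only eight neurons of each type, every synapse starts with the same inhibitory strength, co-activation simply halves that strength, and the winner-take-all competition is reduced to picking the heading neuron that receives the least inhibition. All names and parameter values are hypothetical.

```python
N = 8  # hypothetical number of visual ring neurons and of heading neurons


def uniform_inhibition(n):
    """weights[h][v]: inhibition from visual neuron v onto heading neuron h."""
    return [[1.0] * n for _ in range(n)]


def pair(weights, visual_idx, heading_idx, rate=0.5):
    """Co-activation weakens the inhibitory synapse between the active pair."""
    weights[heading_idx][visual_idx] *= 1.0 - rate


def winning_heading(weights, visual_idx):
    """Winner-take-all: the least-inhibited heading neuron stays active."""
    inhibition = [row[visual_idx] for row in weights]
    return inhibition.index(min(inhibition))


weights = uniform_inhibition(N)
offset = 3  # arbitrary alignment between the visual scene and the heading ring

# As the simulated fly turns, each visual neuron fires together with the
# heading neuron that sits `offset` positions further around the ring.
for v in range(N):
    pair(weights, v, (v + offset) % N)

# After learning, a cue at each visual position selects one heading neuron.
learned_map = [winning_heading(weights, v) for v in range(N)]
print(learned_map)
```

After training, each visual position selects the heading neuron it was paired with, so the visual scene alone is enough to set the compass. A different value of `offset` would produce a different but equally consistent mapping, which mirrors the arbitrary alignment seen when a fly enters a new environment.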

In a complementary experiment, Kim 
et al. presented flies with VR scenes derived 
from natural images, moving a step closer to 
naturalistic conditions. They then stimulated 
heading neurons in arbitrary orientations 
relative to the visual cues the fly was receiving, 
thereby altering neurons’ preferred heading 
directions. After this stimulation period, the 
offsets between heading-neuron activity and 
visual input remained intact, demonstrat- 
ing the capacity of the system to learn new 
visual-heading associations. Even partial 
views of a scene, when paired with stimula- 
tion, caused global changes in the activity of 
the heading-neuron network. This reveals a 
useful property of the network for our subway
set-up: it enables you to orient yourself
at a new station without having to survey all
360° of the scene.

But the system’s flexibility could have a 
downside — if synapses can change, can they 
also be erased? Kim et al. asked whether the 
heading network can ‘remember’ multiple 
scenes. First, they found that presenting 
flies with different scenes elicited different 
heading-neuron direction preferences, which 
varied from fly to fly. But, crucially, these pref- 
erences were stable for a given scene for each 
fly, even when the scene was presented as part 
of a ‘slide show’ of multiple different scenes. 
This shows that the fly’s heading network can 
store and retrieve memories of scenes. The 
authors conclude their paper by developing 
theories that predict what types of scene can 
be simultaneously stored and what kinds of 
rule allow scenes to be learnt without existing
memories being erased. 

Together, these studies rigorously establish 
the ability of the fly’s heading network to learn 
through associative plasticity. Future work 
should explore the memory capacity of the 
system. A key question is whether flies and 
other insects use memories of complex scenes 
for navigation, or rely more heavily on celestial 
cues such as the Sun. Other types of sensory
input, such as light polarization, also probably
have a role in anchoring insect heading repre-
sentations, and need to be taken into account.
In addition, molecular and cellular work will 
be needed to uncover the synaptic-plasticity 
rules at work in the system and to determine 
whether they match Kim and colleagues’ 
theoretical prediction. Finally, this work gener- 
ates hypotheses that should be tested in other 
species, because many properties of the fruit 
fly’s heading neurons are similar to those of 
mammalian head-direction cells. 

So, although it might not have mastered 
the subway, the fruit fly has deepened our 
understanding of the neural mechanisms 
that underlie our sense of direction. A rich 
landscape of further research awaits. 


Malcolm G. Campbell and Lisa M. Giocomo 
are in the Stanford University School of 
Medicine, Stanford, California 94305, USA. 
M.G.C. is also in the Department of Molecular 
and Cellular Biology, Harvard University, 
Cambridge, Massachusetts. 

e-mails: mgcampb@fas.harvard.edu; 
giocomo@stanford.edu

All ears about
ancient mammals


Anne Weil 


The configuration of middle-ear bones in an ancient fossil 
suggests that specializations suited to eating plants might 
have influenced how the jaw joint evolved to form the
mammal’s ear. See p.102


The presence of three delicate bones in the 
middle ear that are completely separated from 
the lower jaw can be used to distinguish exist- 
ing mammals from other vertebrates. This 
arrangement evolved independently at least 
three times in mammals, so it is not found in all
mammalian fossils. On page 102, Wang et al. 
describe a newly discovered fossil that reveals 
how these different middle ears evolved into 
distinct configurations. 

The authors named this previously unknown 
species Jeholbaatar kielanae. It was about the
size of a vole, and scampered around China 
about 120 million years ago. It belonged to the 
longest-lived mammalian lineage, the multi- 
tuberculates. These typically small-bodied 
mammals persisted from about 160 million 
to 34 million years ago, and diverse members 
of this lineage became common throughout 
the Northern Hemisphere. 

Multituberculates might have been so 
successful because they chewed differently 
from other mammals. Instead of slicing food 
into pieces using a vertical biting motion like 
a cat does, or grinding their food by moving 
their lower jaw (the mandible) horizontally and 
sideways like a cow, multituberculates sliced 
and ground food by drawing their mandible 
horizontally but backwards. This innovation, 
‘palinal motion’, required specializations 
of the teeth, jaw joint and musculature. It 
contributed to the unmatched longevity of 
the multituberculate lineage, and it facilitated 
group diversification by enabling multi- 
tuberculates to use plants as a food source 
at a time in prehistory when other mammals 
mainly ate insects or small vertebrates. 

Wang and colleagues argue that the adapta- 
tion of this chewing approach also drove the 
evolution of an unusual type of ear. In each 
independent instance, mammalian middle 
ears evolved from an ancestral jaw joint. In 
every case, the articular bone at the back of 
the mandible and the quadrate bone (which 
became the incus bone of the middle ear) that 
it made contact with on the skull retained their
connection. These bones shifted slightly
internally to form a middle ear together with
a bone called the stapes, which was present 
in mammalian ancestors. Other bones then 
formed the jaw joint that mammals have today.
In transitional stages of this evolutionary 
process, the connection between the middle 
ear and the mandible was still present at a 
middle-ear bone called the malleus, although 
the extent of this connection was reduced 
compared with the connection in the ancestral 
state. Both the jaw and the ear had to function at
all stages of the transition. If multituberculates
had adopted palinal chewing before the sep- 
aration of the middle-ear bones from the jaw, 
how would this arrangement have worked? 
The tiny but exquisitely preserved middle ear 
of Jeholbaatar (Fig. 1) is completely separated 
from the jaw, but it provides the beginning of 
an answer to this question. 

It has long been suspected that, in mam- 
malian ancestors, the articular bone and the 
prearticular bone of the ancestral jaw fused 
to form the malleus. Fossil discoveries have 
suggested that a third bone, the surangular, 
also fused with the articular, at least in some 
lineages. In Jeholbaatar, the surangular is
present as a separate bone distinguishable 
along the lateral side of the malleus. The only 
other animal in which a separate surangular 
has been described in the ear also shares a 
second odd trait with Jeholbaatar: the position
of the incus in the middle ear. 

The incus lies flat on top of the malleus 
in Jeholbaatar, in contrast to its position in 
humans and opossums (Didelphis), in which 
it is positioned posteriorly, behind the 
malleus. This contact between the incus and 
the malleus in Jeholbaatar, horizontal and 
parallel to the plane in which the teeth would 
have met, is what we would expect to see if
palinal chewing had evolved before the middle
ear was separate from the jaw.

Figure 1 | The evolution of mammalian middle ears. Wang et al.¹ report the discovery of a fossil of a
previously unknown mammalian species, Jeholbaatar kielanae. Its middle ear is similar to that of an extinct
animal called Arboroharamiya. a, This similarity might indicate that Jeholbaatar and Arboroharamiya should
be grouped close together on a mammalian family tree, and suggests that the ‘palinal’ chewing motion
used by Jeholbaatar and Arboroharamiya has a single origin in a shared ancestor. Also shown in this tree are
platypuses (Ornithorhynchus) and opossums (Didelphis), mammals that don’t use palinal chewing and that
have middle-ear configurations that are distinct from each other and from Jeholbaatar and Arboroharamiya.
b, However, there is some debate about whether Arboroharamiya were mammals. If not, as in this tree, then
the similar middle ears of Jeholbaatar and Arboroharamiya evolved independently. c, The configurations
of the left middle-ear bones of these four creatures are presented as viewed directly from above, with the
animal's front to the right. The different configurations of the incus, malleus and surangular bones might
reflect the evolution of jaw specializations before bones separated from the jaw to form the ear. (Images
based on ref. 1 and not shown to scale.)

During transitional evolutionary stages, 
when the malleus was connected to the 
mandible, palinal jaw movement would have 
constrained the plane in which the malleus 
and incus could have been in contact; had 
the incus been in the more familiar posterior 
position found in most mammals today, it 
would have acted as a stop on backward jaw 
motion. Once palinal motion for chewing 
was established, increasing the distance the 
lower jaw moved forwards and backwards on 
the jaw joint would have made chewing more 
efficient. Any remaining tether to the ear 
would have limited the distance that the lower 
jaw could travel in a single chew, so selection 
pressure for a fully separate ear and jaw would
have been strong, and full separation could have
evolved rapidly. 

The other animal known to have a surangular
in the ear is Arboroharamiya, a member of
an ancient group known as euharamiyidans
with a palinal element to its chewing and an
earlier origin than that of multituberculates.
Arboroharamiya, like Jeholbaatar, has its
incus positioned above the malleus. The
relationship between euharamiyidans and
multituberculates on the evolutionary tree is
a matter of lively debate, with some studies,
including that of Wang and colleagues, show-
ing them to be closely related within mam-
mals, whereas others place euharamiyidans
on a lineage that branched off before the com-
mon ancestor of living mammals evolved. If
the latter scenario is the case, then euharam-
iyidans would represent a fourth instance of
the independent evolution of a fully detached 
middle ear. 

The question of whether the similarities 
between the ears of Jeholbaatar and Arboro-
haramiya reflect a close relationship on the
evolutionary tree or independent (conver- 
gent) evolution driven by similar chewing 
adaptations is further complicated by another 
consideration: the incus of living platypuses 
(Ornithorhynchus) and echidnas, or spiny 
anteaters (Tachyglossus), also lies above the 
malleus. These mammals belong to a group 
called monotremes, whose middle ear evolved 
independently of that of other mammals. 
Monotremes do not use a palinal chewing 
motion, and the teeth of fossil monotremes 
do not suggest that such a motion occurred 
in early members of that lineage. They
might have this arrangement of their incus 
and malleus for reasons that are entirely 
different from those explaining the arrange- 
ment of these bones in multituberculates or 
euharamiyidans. Monotremes do not retain a
recognizable surangular. If the similarities in
the middle ears of Jeholbaatar and Arborohara-
miya reflect the functional similarity in the way
the animals chewed, the unfused surangular in
Jeholbaatar and Arboroharamiya might simply
reflect the rapidity with which the transition
to detachment of the middle ear from the jaw 
occurred, spurred on by the increased effi- 
ciency in food processing that this complete 
separation would have provided.

Electrons in graphene
go with the flow


Klaus Ensslin 


Scattering between electrons in the material graphene can 
cause these particles to flow like a viscous liquid. Such flow, 
which has previously been detected using measurements of 
electrical resistance, has now been visualized. See p.75 


Water in a river shows a variety of flow patterns
and whirls. Any obstacle in the river, such as a
bridge pillar or simply a rough bank, will lead
to a distinctive flow pattern. It has been com-
paratively less obvious how electrons flow in
a solid. But on page 75, Sulpizio et al.¹ report
an experiment in which the flow pattern of 
electrons in an electrical conductor is imaged. 

The electrical resistance of a metal is caused 
by electrons being scattered from impurities 
in the material’s atomic lattice or from lattice 
vibrations called phonons. However, it is not
affected by electron-electron scattering.
When two electrons scatter off each other, 
their individual momenta are changed by the 
scattering event. But the total momentum of 
the two electrons is conserved, as is the total 
momentum of a sea of electrons in a metal.
Therefore, simply measuring the resist- 
ance of a metal will not unveil the effects of 
electron-electron scattering. 
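
The momentum bookkeeping behind this argument can be checked with a short Python sketch. It treats the two electrons as classical, equal-mass point particles, which is a simplifying assumption (electrons in graphene do not behave like ordinary massive particles), and the function name is made up for this illustration. A scattering event is modelled as a rotation of the pair's relative momentum, which changes each particle's momentum but leaves the sum untouched.

```python
import math


def scatter_pair(p1, p2, angle):
    """Scatter two equal-mass particles by rotating their relative momentum.

    p1 and p2 are (px, py) tuples. The centre-of-mass momentum is untouched,
    so the total momentum of the pair is the same before and after.
    """
    total = (p1[0] + p2[0], p1[1] + p2[1])
    rel = ((p1[0] - p2[0]) / 2, (p1[1] - p2[1]) / 2)
    c, s = math.cos(angle), math.sin(angle)
    rel_rot = (c * rel[0] - s * rel[1], s * rel[0] + c * rel[1])
    q1 = (total[0] / 2 + rel_rot[0], total[1] / 2 + rel_rot[1])
    q2 = (total[0] / 2 - rel_rot[0], total[1] / 2 - rel_rot[1])
    return q1, q2


p1, p2 = (1.0, 0.0), (0.0, 0.5)
q1, q2 = scatter_pair(p1, p2, math.pi / 3)
total_before = (p1[0] + p2[0], p1[1] + p2[1])
total_after = (q1[0] + q2[0], q1[1] + q2[1])
```

Comparing `total_before` with `total_after` shows that the pair carries the same net momentum after the event, even though `q1` differs from `p1`. Because resistance reflects the loss of net momentum from the electron sea, events of this kind leave it unchanged.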

Figure 1 | Electron interactions in graphene. The material graphene consists of a single layer of carbon
atoms arranged in a hexagonal lattice. Electrons flowing through graphene can be scattered from impurities
(such as foreign atoms in the lattice), from other electrons and from lattice vibrations known as phonons.
At low temperatures, electron-impurity scattering dominates. By contrast, at high temperatures,
electron-phonon scattering takes over. Sulpizio et al.¹ report observations of graphene at intermediate temperatures
for which the rate of electron-electron scattering is the largest among all scattering rates.

© 2019 Springer Nature Limited. All rights reserved.

Nature | Vol576 | 5 December 2019 | 45

News & views

To nail down these effects, materials need
to be tuned to a regime in which electron-
electron scattering is dominant and the
electrons flow like a viscous liquid. At low
temperatures, electron-electron (as well as 
electron-phonon) scattering is suppressed 
and electron-impurity scattering dominates. 
Conversely, at high temperatures, electron- 
phonon scattering takes over. For graphene 
(a single layer of carbon atoms arranged in a
honeycomb lattice), there is an intermediate
temperature range (50-250 kelvin) for which
the rate of electron-electron scattering is the 
highest among all scattering rates (Fig. 1). 
However, even in this case, the material’s 
resistance will not be modified by electron- 
electron scattering because of momentum 
conservation. 

One way to investigate the viscous-flow 
regime has been to measure a local resistance, 
known as vicinity resistance, on an extremely
small scale. The value of this quantity changes 
sign in the case of viscous flow. Another 
option has been to observe an effect called 
superballistic resistance for electrons flow-
ing through a narrow opening in a material.
Here, the resistance is reduced below the value 
expected for a ballistic system, in which there 
is effectively no scattering. Such pioneering 
experiments were crucial for demonstrating 
that viscous electron flow can be important 
in electron transport. However, they provide 
only indirect evidence for the existence of such 
flow and do not give insights into the spatial 
arrangements of flow patterns. 

Electrons passing through a sample of a 
conducting material are driven by an electric 
field. As a result, there is a voltage gradi- 
ent along the direction of current flow. 
Unfortunately, this local voltage gradient is 
independent of the flow regime. But when a 
weak magnetic field is applied to the sample, 
another voltage, known as a Hall voltage, is 
produced perpendicular to the direction of 
current flow. The spatial profile of the Hall 
voltage does provide information about the 
flow characteristics. 

Sulpizio and colleagues use a sensitive 
electric-field sensor that enables local probing 
of this Hall voltage. The sensor is an innovative 
technology developed by this research group.
It consists of an electronic device called a 
single-electron transistor, the conductance of 
which depends sensitively on its electrostatic 
environment. 

In the present work, the sensor is made 
from ultraclean carbon nanotubes. Individual 
electrons are confined within these nanotubes 
by electrodes. Such an arrangement provides 
the required sensitivity for detecting weak 
electric fields or voltage gradients, such as 
those associated with the Hall voltage. The 
spatial resolution of the sensor is limited by 
its size and the distance of the sensor to the 
object to be probed. 

Changing the temperature and the number 
of charge carriers per given area in the sample
induces different flow regimes, which lead to
different Hall-voltage profiles. Sulpizio et al.
use this property to image local electric fields 
in a uniform layer of graphene, and inves- 
tigate the transition between the regime in 
which electron-electron scattering domi- 
nates and those in which electron-phonon 
or electron-impurity scattering takes over. 
The authors demonstrate experimentally 
how electron-electron scattering alters the 
Hall-voltage profile of a uniform conductor.

Viscous flow in liquids leads to turbulence and
whirls, depending on the viscosity of the liquid
and on obstacles to the flow. However, the
observation of such features in electron
transport is beyond the scope of the present
work and could require different experimental
tools, such as sensitive magnetic-field sensors,
or samples that have complex geometries.

What do Sulpizio and colleagues’ results
mean for our understanding of electron 
transport in conductors? In the viscous 
regime, the flow of electrons is described by 
a universal hydrodynamic concept known 
as Poiseuille flow. The authors’ imaging of 
electronic Poiseuille flow is a breakthrough 
in the study of electron transport as well as 
a demonstration of a sophisticated imaging 
technique that combines high spatial resolu- 
tion with extreme sensitivity. We now know 
that electron flow can be diffusive, ballistic or 
viscous, and that there are experimental tools 
for differentiating between these regimes. 
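
For readers unfamiliar with Poiseuille flow, the following Python sketch computes the textbook velocity profile of a viscous fluid moving between two parallel walls. It is an idealization and not a model of the authors' data: the no-slip condition at the walls, the function name and the parameters are assumptions made for illustration, and the experiment itself images Hall-voltage profiles rather than velocities.

```python
def poiseuille_profile(width, peak_velocity, n_points=5):
    """Return (position, velocity) pairs across a channel of the given width.

    Positions run from -width/2 to +width/2. The velocity is parabolic and
    falls to zero at the walls (no-slip boundary, an idealization).
    """
    half = width / 2
    profile = []
    for k in range(n_points):
        y = -half + width * k / (n_points - 1)
        velocity = peak_velocity * (1 - (y / half) ** 2)
        profile.append((y, velocity))
    return profile


profile = poiseuille_profile(width=2.0, peak_velocity=1.0)
```

The flow is fastest in the middle of the channel and slows towards the edges, which is the qualitative signature that separates viscous flow from a uniform profile across the channel.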


For solid-state systems in general,
electron-electron interactions are relevant
for phenomena as diverse as ferromagnetism
(the familiar type of magnetism found in iron
bar magnets) and the fractional quantum Hall
effect (whereby electrons in a strong magnetic
field act together to behave like particles that
have a fractional electric charge). The authors’
technique could also be used to investigate,
on a local scale, the superconductivity that
was discovered last year in a twisted bilayer
of graphene. The potential to extract local
information about strongly interacting
systems of electrons will have far-reaching
consequences for this field. Further appli-
cations of the technique could enable local
probing of electric fields as they arise in com-
plex quantum circuits, which might one day
lead to a quantum computer.

Neurodevelopment

Birth of a motor
circuit visualized


Kristen P. D’Elia & David Schoppik 


A sophisticated imaging pipeline has been developed to track
neurons in early-stage zebrafish embryos over time and space.
It reveals how newborn neurons come together to build a
spinal cord capable of locomotion.


Where a person comes from and what they do 
are often considered key parts of their iden- 
tity. Similarly, neurons can be categorized 
by both their developmental history and 
their role in the nervous system. But, just as 
knowing someone’s job title does not neces-
sarily tell you what part they play in a team at
work, knowing what role a neuron has does 
not mean that we understand how it comes 
together with other diverse neuron types to 
form circuits, for instance to permit move-
ment. Writing in Cell, Wan et al. describe an
imaging protocol that will help researchers
determine how neural circuits form. They
use their method to comprehensively chart
motor-circuit assembly and emerging func-
tion in the spinal cord of zebrafish.

In vertebrate embryos, the first neuronal 
circuits to respond to sensory information and 
orchestrate movement are found in the spine.
These motor circuits are assembled from doz-
ens of molecularly specialized types of neuron.
Nonetheless, this is a relatively simple set-up,
making it a useful system for studying how
neuronal circuits come together to produce
behaviour: in this case, muscles contracting
in distinct patterns. 

Figure 1 | Tracking the building blocks of a circuit. Wan et al. have developed an imaging and
computational pipeline to track neurons of the zebrafish spinal cord, from their ‘birth’ 6 hours after embryo
fertilization until they begin to show the coordinated activity of a motor circuit at 22 hours. The authors
traced newly born sister cells (derived from the same immediate ancestor, indicated by different shades
of the same colour). By 17 hours, the cells have migrated to their mature positions and adopted molecular
characteristics of either motor neurons (star-shaped cell) or interneurons (circular cell). By 22 hours, the
cells become wired into coordinated circuits (inset). Motor neurons are the first to become active, and the
authors showed that they then imprint their activity onto other neurons (dotted arrows), leading these
neurons to adopt the same activity pattern.

Wan et al. set out to study the formation
of these early motor circuits in zebrafish
embryos (Fig. 1). This research group has long
been at the forefront of in vivo microscopy,
pioneering light-sheet microscopy tech- 
niques that can illuminate all of the individual 
cells that make up developing organisms 
such as zebrafish without harming them. 
Zebrafish are well suited to such studies 
because they are small, transparent and 
develop rapidly. 

The researchers imaged zebrafish from 
6 hours after embryo fertilization, when 
spinal neurons first arise from their progen- 
itors, to 22 hours after fertilization, when 
the patterns of neuronal activity that trigger 
tail movements begin. The imaging process 
generated vast libraries of images that Wan 
and colleagues processed to extract infor- 
mation about the location of individual cells 
over time. In addition, the authors optimized 
their microscope design to allow them to 
measure emergent patterns of functional 
activity from individual cells. The result was 
a data set that enabled the group to track the 
organization and function of every cell in 
the zebrafish spinal cord throughout early 
development. 

Motor neurons and interneurons are key 
neuron types in spinal motor circuits. The for- 
mer are responsible for triggering muscle-fibre 
contraction and the latter coordinate signal-
ling within and between circuits (for example,
to ensure alternating left-right movements
during swimming). Motor neurons have often
been thought of as passive cells controlled
by upstream interneuron inputs, whereas
interneurons had been thought to be the driv-
ing force behind the assembly and function
of spinal motor circuits. But over the past
few years, evidence has emerged that both
developing and mature motor neurons can
control their connections to interneurons, and
even control interneuron activity. In zebrafish,
motor neurons are the first spinal neurons to
display spontaneous activity patterns. As a
circuit develops, neurons often first become 
active on their own, and then coordinate 
their activity with that of other neurons. Wan 
et al. therefore asked whether this activity 
originates in the motor neurons themselves, 
or reflects interneuron control. 

The authors found that select motor 
neurons seem to impose their own activity 
on neighbouring motor neurons and inter- 
neurons, producing pairs of cells that have 
the same activity patterns. Thus, the earliest 
patterns of collective activity are initiated 
by motor neurons. This finding adds to the 
emerging picture of motor neurons as a 
fundamental driver of spinal-cord develop- 
ment. Consistent with previous findings, the 
authors also confirmed that interneurons 
coordinate the global patterns of activity 
necessary at later developmental stages for tail 
movement. 

One theory of neural development states 
that cells that have a shared ancestry are 
destined to have common connectivity, and 
to perform similar roles in a circuit. Evidence
for such determinism remains contentious,
reflecting the challenge of tracing related neu-
rons as they migrate. But Wan and colleagues
were able to investigate this issue, thanks to 
their ability to comprehensively track cells 
over time. 

The authors examined the activity of sister 
neurons — those that shared an immediate 
ancestor. In line with ideas of determinism, 
sister neurons that ended up in close proxi- 
mity to one another were more likely than 
unrelated neurons to be co-active. But, 
intriguingly, most sibling pairs did not remain
close to one another. Indeed, sister neurons
were just as likely to migrate to opposite sides 
of the spinal cord, where they would partici- 
pate in different phases of movement. Thus, 
ancestry can explain only a small part of 
functional organization. That said, Wan and 
colleagues’ study is limited to the earliest 
part of development, well before zebrafish 
hatch and swim freely. It will be interesting to 
re-evaluate questions of ancestral determin- 
ism over longer periods of time. 

Another limitation of the authors’ technique 
is that their cutting-edge microscope is best 
suited to small model organisms. It would be 
interesting to analyse whether their findings 
also apply to more-complex organisms. How- 
ever, current microscopes cannot be used for 
such purposes. 

Notably, the group that performed the study 
(and the Janelia Research Campus in Ashburn, 
Virginia, at which it works) is committed to 
providing access to the microscope used in 
the current work. In addition, the authors’ 
data and analysis pipelines are available to 
download. Thus, other researchers can further 
assess the relationship between the develop- 
mental history and function outlined in the 
current study. 

Advances in the transcriptional profiling 
of single cells have revealed remarkable 
variability among neurons, making circuit
development ever-more fascinating but 
incredibly challenging to fully understand. 
Until we have a greater understanding of the 
molecular logic that enables neurons to form 
motor circuits, our ability to prevent, diagnose 
and treat disorders of movement will remain 
limited. The apparatus and analysis pipeline 
developed by Wan et al. present a technically
demanding but demonstrably fruitful path
towards better grasping how a neuron’s birth
shapes its future role in a circuit.

Genetic engineering


CRISPR tool enables 
precise genome editing 


Randall J. Platt 


The ultimate goal of genome editing is to be able to make any 
specific change to the blueprint of life. A ‘search-and-replace’ 
method for genome editing takes us a giant leap closer to this
ambitious goal. See p.149


Variation in the DNA sequences that constitute
the blueprint of life is essential to the fitness of
any species, yet thousands of DNA alterations
are thought to cause disease. After decades of
research in genetics and molecular biology, tre-
mendous progress has been made in develop-
ing genome-editing tools for correcting such
alterations. But a seemingly fundamental limit
to the efficiency and precision of gene editing
was reached, owing to the tools’ reliance on
complex and competing cellular processes.
On page 149, Anzalone et al. describe ‘search-
and-replace’ genome editing, in which the
marriage of two molecular machines enables 
the genome to be altered precisely. The 
technique has immediate and profound 
implications for the biomedical sciences. 

Human efforts to engineer genomes 
pre-date knowledge of genes or even of the 
source of heredity. The first genome engineer- 
ing relied on natural variation and artificial 
selection through selective breeding. Modern 
maize (corn), for example, was ‘engineered’ 
from its wild ancestor, teosinte, through arti-
ficial selection more than 9,000 years ago.
Later progress was fuelled by the realiza- 
tion that DNA sequences shape life, and that 
evolution can be augmented and artificially 
accelerated through the use of mutagenic 
agents, such as radiation or chemicals. 

Next came the discovery that cellular 
processes for repairing mistakes in DNA 
sequences could be hijacked, allowing 
sequences from a foreign ‘template’ DNA to be
inserted into the genome at DNA breaks. This
process is greatly enhanced if the DNA is inten-
tionally damaged, a finding that sparked a
search of more than 20 years for an enzyme
that could specifically cut DNA at locations of
interest. The search culminated in the adop-
tion of the bacterial CRISPR-Cas9 system, in
which the enzyme Cas9 uses a customizable
RNA guide to search for DNA sequences to cut
in human cells (Fig. 1a).

CRISPR-Cas9 sparked a revolution in the 
biomedical sciences by making genome 
editing accessible to all researchers, but,
ultimately, it is just a fancy pair of molecular
scissors that cuts DNA. Because cuts in DNA 
are deadly to cells, they are urgently repaired 
by one of many independent pathways. In 
the context of genome editing, the desired 
outcome is for repair to be directed by a tem-
plate DNA, resulting in precise edits. But most
cells prefer to use an alternative mechanism, 
in which the DNA template is ignored and 
the two broken ends of DNA are imperfectly 
stitched back together — a major limitation 
for genome editing. 

Much effort over the past few years has 
focused on shifting the balance from imperfect 
to precise editing. One effective strategy is to 
edit DNA without cutting both DNA strands
in the helix; double-strand breaks are the
main insult that leads to imperfect edits. A
milestone in this regard was the development
of base editing, a process in which a version
of the Cas9 enzyme that cuts only one DNA 
strand is combined with an enzyme that can 
switch one specific DNA base for another, 
near the nick site (Fig. 1b). However, the tech- 
nical constraints of base editing, and the need 
to modify more than just single DNA bases, 
meant that new genome-editing approaches 
were still desperately needed. 

Anzalone and co-workers now largely fill this
need with a technique called prime editing.
Their approach relies on a hybrid molecular
machine consisting of a modified version 
of Cas9, which cuts only one of the two DNA 
strands, and a reverse transcriptase enzyme, 
which installs new and customizable DNA at 
the cut site (Fig. 1c). This marriage parallels a 
naturally occurring process in yeast, in which 
DNA that corresponds to an RNA sequence is 
incorporated into the genome by a reverse 
transcriptase.

The prime-editing process is orchestrated 
by an engineered, two-part RNA guide. The 
‘search’ part of the guide directs Cas9 toa 
specific sequence in the DNA target, where it 
cuts one of the two DNA strands. The reverse 
transcriptase then produces DNA complemen- 
tary to the sequence in the ‘replace’ part of the 
RNA guide, and installs it at one of the cut DNA 
ends, where it takes the place of the original 
DNA sequence. 

Figure 1 | Evolution of genome editing. a, In conventional genome editing, a Cas9 enzyme is directed to
a position in the genome by a guide RNA, and produces a double-strand break. The host cell’s DNA-repair
machinery fixes the break, guided by a template DNA, incorporating template sequences into the duplex.
b, In an approach called base editing, a Cas9 that produces only single-strand breaks (nicks) works with a
deaminase enzyme. The deaminase chemically modifies a specific DNA base; here, a cytidine base (C) is
converted to uracil (U). DNA repair then fixes the nick and converts a guanine-uracil (G-U) intermediate to
an adenine-thymine (A-T) base pair. This method is more precise than a, but makes only single-nucleotide
edits. c, Anzalone et al.' report prime editing, which can precisely edit DNA sequences. A nick-producing
Cas9 and a reverse transcriptase enzyme produce nicked DNA into which sequences corresponding to the
guide RNA have been incorporated. The original DNA sequence is cut off, and DNA repair then fixes the
nicked strand to produce a fully edited duplex. In some cases, another nick is made in the unedited strand of
the duplex before the DNA-repair step (not shown).

At this point, the duplex DNA being modified
consists of two non-complementary strands: 
the edited strand, and the intact strand that 
wasn’t cut by Cas9. Non-complementary 
sequences are not tolerated in cells, so one 
of the strands must be fixed by DNA-repair 
processes to match the other, with the intact 
strand typically being preferentially retained. 
The authors therefore usually had to use a 
second RNA guide to direct a cut to the intact 
strand, to increase the chances that that 
strand would be repaired to match the edited 
sequence. The cut must be made strategically 
to avoid breaking both strands at the same 
time or place. 

Anzalone et al. demonstrate the versatility 
of prime editing by using it to efficiently and 
precisely install a wide range of sequences 
into DNA. For example, they used it in vitro
in human embryonic kidney cells to correct
the mutations that give rise to the blood 
disorder sickle-cell disease, and to edit the 
mutations that cause the neurological condi- 
tion Tay-Sachs disease. Imperfect edits were 
almost entirely avoided. The authors also 
carried out edits in human cancer cells and in
mouse neurons in vitro. 

For decades, the potential of genome 
editing has been constrained by the difficulty 
of making precise modifications, and so applications
have focused heavily on situations in
which imperfect DNA edits are useful. For
example, such edits can be used to impair the 
function of a gene, providing an avenue for 
understanding its function”. Prime editing 
now makes it faster and easier than before to 
install or correct one or many specific muta- 
tions (such as those found in human patients, 
or synthetic sequences that are useful for 
research purposes). And it makes more cell 
types available for manipulation than was pre- 
viously possible. The chains that have shackled 
gene editing have thus come off — no doubt 
quickening the pace of research and enabling 
a list of new applications.

Nevertheless, prime editing has limitations. 
First, the sophisticated, multi-step molecular 
dance that occurs between the prime-editing 
components is not yet predictable and doesn’t 
always turn out as intended. Imperfect random 
edits can therefore still arise, which means that 
several combinations of components might 
need to be tested, to work out the choreo- 
graphies required for each edit of interest. 
Second, delivering the large prime-editing sys- 
tem into some cell types could be challenging, 
given that many previous attempts have 
faltered with the conventional Cas9 system”, 
which is roughly half the size. 

For research purposes, these limitations are 
mostly just inconvenient, and will probably be 
overcome through follow-up work directed
at better understanding and fine-tuning the
method. For medical applications, however, 
these issues present a much greater challenge 
— imperfect DNA edits are unacceptable, and 
efficient delivery of the prime-editing sys- 
tem to cells will be crucial. So although prime 
editing certainly has the potential to give us 
unprecedented control over the blueprint of 
life, only time will tell whether it becomes just 
another tool in the CRISPR toolbox or a cure-all
for genetic diseases. 

Obesity and type 2 diabetes are the most frequent metabolic disorders, but their
causes remain largely unclear. Insulin resistance, the common underlying
abnormality, results from an imbalance between energy intake and expenditure
favouring nutrient-storage pathways, which evolved to maximize energy utilization 
and preserve adequate substrate supply to the brain. Initially, dysfunction of white 
adipose tissue and circulating metabolites modulate tissue communication and 
insulin signalling. However, when the energy imbalance is chronic, mechanisms such 
as inflammatory pathways accelerate these abnormalities. Here we summarize 
recent studies providing insights into insulin resistance and increased hepatic 
gluconeogenesis associated with obesity and type 2 diabetes, focusing on data from 
humans and relevant animal models. 

Over the past 50 years, the prevalence of diabetes mellitus has continued
to increase, spreading from western countries to the western Pacific, Asia
and Africa. Current projections estimate an increase of more than 50% 
between 2017 and 2045, leading to around 693 million people suffering 
from diabetes, with estimated healthcare costs of about US$850 billion 
per year’. This epidemic mainly results from an increase in the incidence of 
type 2 diabetes (T2D), a heterogeneous disease characterized by deficient
insulin secretion by pancreatic islet β-cells in the context of impaired
insulin sensitivity, termed insulin resistance. A genome-wide association
study (GWAS) found more than 400 T2D-associated gene variants, mostly
related to islet function, but the roles of the individual genes are minor 
and explain less than 20% of overall disease risk’. Lifestyle-modification 
studies demonstrating T2D remission underline the predominant role 
of acquired alterations, including intake of highly palatable, energy- 
dense refined food, sedentary behaviour and other factors (for example, 
environmental pollution, socioeconomic and psychosocial conditions, 
smoking and sleep deprivation)°. Moreover, parental lifestyle, intrauterine 
programming and early postnatal metabolic alterations may influence the 
risk profile’ via DNA methylation’. The roles of these mechanisms and of
the gut metagenome are controversial in humans and beyond the scope 
of this Review. This Review focuses on studies in humans and relevant 
rodent models to provide an outlook on future precision medicine for 
T2D by better understanding its pathogenesis. 
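
The projection quoted above gives only the 2045 figure and a lower bound on the relative increase, so the 2017 baseline it implies can be bounded with one line of arithmetic. The sketch below takes the two quoted numbers at face value; the function and variable names are illustrative only.

```python
def implied_baseline_upper_bound(projected, min_relative_increase):
    """Largest baseline that still grows by at least the stated fraction."""
    return projected / (1.0 + min_relative_increase)


projected_2045_millions = 693.0  # "around 693 million people" (from the text)
min_increase = 0.50              # "an increase of more than 50%" (from the text)

baseline_2017_max = implied_baseline_upper_bound(projected_2045_millions, min_increase)
print(f"Implied 2017 prevalence: below about {baseline_2017_max:.0f} million")
```

Under these figures, the 2017 prevalence must lie below roughly 462 million for the increase to exceed 50%.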


Fed-to-fasting transition and insulin resistance 


Fasted humans display impaired insulin-stimulated glucose disposal
and elevation of certain, mainly branched-chain, amino acids
and nonesterified fatty acids (NEFA) in plasma despite low-to-normal
glycaemia and hypoinsulinaemia®. While initially hepatic glycogen- 
olysis and gluconeogenesis maintain normoglycaemia’, the shift from 
carbohydrate to fatty acid oxidation preserves glucose for obligate 
glucose utilizers (such as the brain, red blood cells and renal medulla) 
and essential protein stores, which would otherwise be used for glu- 
coneogenesis (Fig. 1a). Stimulation of gluconeogenesis has mostly 
been attributed to decreasing plasma insulin and increasing plasma 
glucagon concentrations favouring gluconeogenic enzyme transcrip- 
tion. Recent studies in rats have demonstrated a critical role of the 
leptin-hypothalamic-pituitary-adrenal (HPA) axis in mediating the
fed-to-fasting transition via glucocorticoid regulation of white adipose 
tissue (WAT) lipolysis’° °—similar to mechanisms that operate in uncon- 
trolled diabetes”. These studies show that the early postabsorptive 
decline in hepatic glycogenolysis is predominantly responsible for the 
fall in plasma insulin and glucose concentrations, resulting in an approximately
50% reduction in plasma leptin concentrations. Thus, leptin acts
as an important fuel gauge for energy stored as triacylglycerol (TAG) in
WAT and as glycogen in liver, signalling to the brain when both energy
depots are depleted”. The fall in leptin to less than 1 ng ml⁻¹ stimulates
the HPA axis, ultimately increasing plasma corticosterone concentrations,
which, during hypoinsulinaemia, stimulate WAT lipolysis and
release of NEFA and glycerol, with a switch towards fatty acid oxidation.
Increased fatty acid flux to the liver increases hepatic β-oxidation
and acetyl-CoA content (Fig. 1b). This allosterically activates pyruvate 
carboxylase flux, which, along with increased glycerol flux to the liver, 
is essential for maintaining hepatic gluconeogenesis and endogenous 
glucose production (EGP) during starvation. Simulation of fasting con- 
ditions also increased the contribution of gluconeogenesis to EGP by up 
to 75%, probably owing to lipid-dependent control of hepatic glycogen 
stores. Starvation also promotes hepatic accumulation of TAG and
diacylglycerol (DAG), which can occur independently of the direct 
action of hepatic insulin on de novo lipogenesis (DNL)'°. Subsequently, 
the novel protein kinase C isoform ε (PKCε) is translocated to the plasma
membrane, where it binds to and phosphorylates Thr1160 of the insulin 
receptor (IR), thereby inhibiting IR kinase activity” (Fig. 1b). Prolonged 


¹Division of Endocrinology and Diabetology, Medical Faculty, Heinrich-Heine University, Düsseldorf, Germany. ²Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for
Diabetes Research at Heinrich-Heine University, Düsseldorf, Germany. ³German Center for Diabetes Research, Partner Düsseldorf, Düsseldorf, Germany. ⁴Departments of Internal Medicine and
Cellular and Molecular Physiology, Yale Diabetes Research Center, Yale School of Medicine, New Haven, CT, USA. *e-mail: michael.roden@ddz.de; gerald.shulman@yale.edu


Nature | Vol576 | 5 December 2019 | 51 


Review 


[Fig. 1 graphic, panels a and b: adipocyte and hepatocyte pathway diagram; see the Fig. 1 legend.]

fasting also resulted in 60% lower rates of glucose—alanine cycling, with 
a 50% reduction in hepatic mitochondrial oxidation, demonstrating 
an interorgan link between liver and muscle during the fed-to-fasting 
transition in both rats” and humans®”’. Unbiased metabolomic analysis
suggests that the same sequence of events occurs in ten-day fasted 
humans and reveals discrete starvation phases with gluconeogenic 
amino acid consumption and subsequent surge in lipids with a high 
degree of unsaturation and chain length”, reflecting increased adi- 
pocyte NEFA release. This study also reported a rapid fall of around 50% 
in circulating leptin and an early rise in plasma glucocorticoids, similar 
to that occurring in fasting”° and anorexia nervosa”. Although circulating
leptin tightly reflects WAT mass, fasting-induced hypoleptinaemia
can occur independently and initiate neuroendocrine adaptation”. 

This mechanism may have been important for survival during peri- 
ods of famine and explain the high evolutionary conservation of the 
Thr1160 residue in the catalytic loop of the IR. Likewise, the blind cave
fish Astyanax mexicanus, which endures infrequent limited nutrient
availability, develops hyperglycaemia, steatosis and insulin resistance
owing to a mutation in its IR gene”, similar to the one observed in some
patients with Rabson-Mendenhall syndrome”. Despite abnormal energy
metabolism, these fish show delayed senescence, further supporting the 
survival benefit of limited insulin-dependent glucose disposal. Thus, DAG 
and novel PKC (nPKC)-induced insulin resistance may have served a key 
evolutionary role to promote survival during starvation, while favouring 
metabolic syndrome and T2D during overnutrition. 


Transition from normoglycaemia to hyperglycaemia 


Longitudinal studies have demonstrated that people who later 
develop T2D display gradually increasing fasting and postprandial 


Fig. 1| Adipose-liver interaction under insulin-sensitive and insulin- 
resistant conditions. a, Under physiological postprandial conditions, insulin 
rapidly stimulates lipid storage by inhibiting lipolysis via adipose triglyceride 
lipase (ATGL and CGI-58), phosphodiesterase 3B (PDE3B) and protein kinase A
(PKA)-controlled hormone-sensitive lipase (HSL) and perilipins (PLINs), and
stimulating lipogenesis (1). Lower NEFA-glycerol flux decreases hepatic acetyl-CoA
and glycerol, thereby acutely diminishing gluconeogenesis (GNG),
reflecting indirect insulin action (2). Direct hepatic IR activation stimulates
glycogen synthesis (GLY) (3) and chronic transcriptional control via decreased
FOXO1 to downregulate gluconeogenic enzymes and upregulate glucokinase
(GK), increasing glucose-6-phosphate (G6P) levels (4). Inhibition of serine
phosphorylation of GSK3 activates and increases glycogen synthase (GS) flux
(3). Furthermore, glucose-6-phosphate allosterically activates glycogen
synthase and inhibits glycogen phosphorylase (GP), resulting in glycogen
storage and suppressed glucose production. Insulin further participates in
protein synthesis via mTORC1 (5) and DNL via FOXO1, carbohydrate and sterol
response element binding proteins (ChREBP and SREBP1c) (6). b, In insulin-resistant
states (obesity and T2D), adipocyte dysfunction (for example, owing
to relative hypoxia induced by saturated fat-stimulated mitochondrial ANT2
triggering the transcription factor HIF-1α) leads to chemokine secretion,
attracting macrophages (MΦ) (1). Immune cell infiltration inhibits adipocyte
insulin sensitivity by mechanisms that ultimately increase lipolysis and NEFA 


glycaemia”°*®. Insulin sensitivity, which is predominantly dependent
on age, sex and weight gain, declines decades before T2D onset,
represents one of the earliest pathogenic events and can be mostly 
ascribed to reduced nonoxidative glucose metabolism” resulting from 
impaired insulin-stimulated storage of ingested carbohydrate as muscle 
glycogen” (Fig. 2b). Initially, β-cells compensate for insulin resistance
by secreting more insulin, resulting in hyperinsulinaemia, which promotes
hepatic DNL, steatosis, hyperlipidaemia and WAT expansion”®.
WAT dysfunction, due to insulin resistance and inflammation“*!”,
stimulates lipolysis, further aggravating hepatic insulin resistance
and non-alcoholic fatty liver disease (NAFLD) (Fig. 1b). Additionally,
increased NEFA and/or glycerol flux to the liver stimulates gluconeogenesis.
Combined with declining β-cell function (at least partly due
to glucolipotoxicity), typically occurring just before T2D onset?’”®,
this leads to fasting and postprandial hyperglycaemia. There appear
to be population-specific differences (for example, quicker decline
in β-cell function in African Americans” and increased hepatic lipid
accumulation and muscle insulin resistance despite lower bodyweight
in Asian Indians”®***). Nevertheless, without weight loss, insulin resistance
and β-cell dysfunction occur simultaneously and continuously,
increasing the risk of comorbidities even before glycaemia exceeds 
current criteria defining diabetes. 


Identifying distinct diabetes phenotypes 

Recent studies challenge the traditional concept of T2D as a single entity,
as patients already exhibit a broad variability in insulin secretion and 
sensitivity at diagnosis». Unbiased cluster analyses discriminated 
subgroups with different degrees of insulin deficiency and moderate 
obesity-related, moderate age-related or severe insulin resistance**®. 
Whereas no known diabetes gene variants were associated with all clusters,
a TCF7L2 variant related to insulin deficiency and a TM6SF2 variant
related to the severely insulin-resistant cluster predicted nephropathy, 
cardiovascular disease*®* and NAFLD®. Soft clustering analyses point to 
further gene-phenotype associations” underlining different patho- 
genic mechanisms. 


Postprandial hepatic metabolite fluxes 

Fasting hyperglycaemia in T2D results from increased rates of hepatic
gluconeogenesis and EGP and from hepatic insulin resistance, characterized
by reduced ability of insulin to suppress this process***1.


and glycerol flux to the liver. Here, acetyl-CoA allosterically stimulates 
pyruvate carboxylase flux (PC), subsequently raising fasting glucose 
production (2). Increased lipid re-esterification generates DAG, thereby 
activating PKCε translocation and inhibitory Thr1160 phosphorylation of the
IR (3) and increasing production of ceramides (4) and ROS (5), collectively 
promoting insulin resistance. Inhibiting direct hepatocellular insulin action 
acutely favours glycogenolysis and chronically upregulates gluconeogenesis 
with postprandial and finally continuous hyperglycaemia (6). In parallel, 
hepatic TAG deposition increases, not only from augmented lipid availability 
and DNL, partly controlled by insulin and FOXO1, but also via nutrient-sensitive 
pathways (ChREBP, SREBP1c and mTORC1) (7). Finally, endoplasmic reticulum
(ER)-derived factors (PKR-like eukaryotic initiation factor 2α kinase (PERK),
inositol-requiring enzyme 1 (IRE1) and activating-transcription factor 6 (ATF6))
can induce an unfolded protein response, which may stimulate lipogenesis,
supported by X-box binding protein 1 (XBP1) and inflammatory pathways. All of
these mechanisms accelerate TAG accumulation and NAFLD progression. 
Dotted lines represent regulation, that is, stimulation or inhibition; thicker 
lines represent pathways with increased flux, thinner lines represent pathways 
with decreased flux. FAS, fatty acid synthase; PEPCK, phosphoenolpyruvate 
carboxykinase; FBPase, fructose-1,6-bisphosphatase; LPK, L-pyruvate kinase.
P, phosphorylation. 


This may be because of direct IR-mediated cell-autonomous or indirect 
effects (substrate availability, allosteric regulation or redox status)” 
(Fig. 1b). Recent studies showed that these indirect effects probably 
result from insulin action on WAT and mainly account for acute suppression
of gluconeogenesis and EGP during postprandial hyperinsulinaemia"*.
Consistent with a minor role for direct hepatic effects of insulin,
rodent models with altered hepatic insulin signalling exhibit relatively 
normal glucose tolerance and compensatory hyperinsulinaemia, with 
reduced hepatic glycogen synthesis as the only indication of disrupted 
insulin signalling.

Direct assessment of glycogen synthesis by ¹³C magnetic resonance
spectroscopy demonstrated lower rates of postprandial and insulin- 
regulated hepatic glycogen synthesis in people with T2D**”. The higher 
half-maximal effective concentration and lower maximum effect of 
insulin on hepatic glycogen synthesis” indicate impaired IR activa- 
tion with subsequent posttranslational modifications of the glycogen 
synthetic machinery and transcriptional regulation of glucokinase 
(Fig. 1b). Whereas other insulin effects, such as transcriptional DNL 
activation via sterol regulatory element-binding protein-1c (SREBP1c),
would be expected to be blunted, hepatic insulin resistance is gener- 
ally associated with increased hepatic TAG and NAFLD. Accordingly, 
it has been proposed that only the FOXO1-dependent, but not the 
SREBP1c-dependent branch of insulin signalling, is defective, sug- 
gesting selective hepatic insulin resistance*’. This hypothesis relies 
on the assumption that DNL is the major source of hepatic TAG and 
on experiments showing different roles of insulin receptor substrate 
(IRS)-1 and IRS-2, substrate-specific AKT phosphorylation or intrinsic 
pathway sensitivities to insulin. Conversely, NEFA re-esterification 
probably accounts for the majority of hepatic lipogenesis and very low- 
density lipoprotein (VLDL) secretion” >. Decreased insulin-stimulated 
hepatic IR kinase activity suggests a common proximal abnormality in
T2D™. Furthermore, DNL upregulation is not dependent exclusively on
IR kinase activity, but can also occur through activation of carbohydrate
response element-binding protein (ChREBP)”, mTORC1-SREBP1c™*
and fructose-stimulated pathways” (Fig. 1b). A recent study found that 
fatty acid esterification to TAG is mostly dependent on NEFA delivery to 
the liver and independent of hepatic insulin signalling’®. This alternative 
hypothesis also explains the development of NAFLD through increased 
NEFA flux derived from increased lipolysis by insulin-resistant WAT. 
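
The phrase "higher half-maximal effective concentration and lower maximum effect" can be made concrete with a simple saturating dose-response curve. The sketch below assumes a hyperbolic form, E = Emax · C / (EC50 + C); this functional form and every parameter value are illustrative assumptions chosen for the example, not measurements from the cited studies.

```python
def insulin_response(conc, e_max, ec50):
    """Hyperbolic dose-response: effect at insulin concentration `conc`."""
    if conc < 0 or ec50 <= 0:
        raise ValueError("conc must be >= 0 and ec50 must be > 0")
    return e_max * conc / (ec50 + conc)


# Hypothetical, unitless parameters for illustration only.
control = {"e_max": 1.0, "ec50": 1.0}
resistant = {"e_max": 0.6, "ec50": 3.0}  # lower maximum effect, higher EC50

for conc in (0.5, 1.0, 3.0, 10.0):
    c = insulin_response(conc, **control)
    r = insulin_response(conc, **resistant)
    print(f"conc={conc:5.1f}  control={c:.3f}  resistant={r:.3f}")
```

A higher EC50 shifts the curve to the right (more insulin is needed for the same effect), whereas a lower Emax caps the response however much insulin is present; in this toy model the two changes together lower the response at every concentration.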

In addition to caloric overload, macronutrients exert specific effects
by modulating enteroendocrine secretion and, in turn, pancreatic 
islet and brain function before reaching the splanchnic bed to directly 

Fig. 2 | Skeletal muscle metabolism under insulin-sensitive and insulin-resistant
conditions. a, Under physiological postprandial conditions, insulin
stimulates autophosphorylation of its receptor, tyrosine phosphorylation of
IRS-1 and PI3K and serine/threonine phosphorylation of AKT2, causing several
post-receptor events (1). Inhibition of AS160 threonine phosphorylation and
RAC1 activation accelerates trafficking of GLUT4-containing storage vesicles
(GSV) to the plasma membrane (2), enabling increases in glucose uptake via
GLUT4 (3) and in intracellular glucose-6-phosphate levels via hexokinase II.
Inhibition of serine phosphorylation of GSK3 activates glycogen synthase flux
(4). Glucose-6-phosphate allosterically activates glycogen synthase (5) and
inhibits glycogen phosphorylase, resulting in postprandial glycogen storage.
b, In insulin-resistant states (obesity or T2D), overnutrition increases NEFA
uptake and TAG storage (1). Starting from fatty acyl-CoA (FA-CoA), lipid
synthesis occurs in different subcellular compartments and yields DAGs (2),
which accumulate in the plasma membrane and stimulate translocation of
novel PKC isoforms (PKCθ and PKCε) to the plasma membrane (3), thereby
inducing inhibitory serine phosphorylation of IRS1 (4). Via serine-palmitoyl
transferase (SPT), saturated fatty acids (SFA) can undergo synthesis of


stimulate insulin secretion and entering the liver. Only around 33% of 
dietary carbohydrates enter the liver, and dietary fat is considered 
to amount to only 10-20% of the hepatic fatty acid pool”. Neverthe- 
less, macronutrients can deliver substrates for the hepatic acetyl-CoA 

ceramides, which may also arise via sphingomyelin and salvage pathways (5).
Ceramides activate protein phosphatase (PP2A) and PKCζ, inhibiting AKT2
phosphorylation. Amino acids (AA) inhibit IRS1 activation via the mTOR-p70S6
serine kinase (S6K) pathway (6). Independently, inherited and acquired factors
can lead to abnormal mitochondrial function (7) accelerating accumulation of
DAG and, potentially, acylcarnitine (from incomplete β-oxidation) (8). Finally,
chronically elevated reactive oxygen species (ROS) can inhibit IRS1
phosphorylation via NF-κB and JNK pathways (9). These effects combine to
decrease glucose transport and glycogen synthesis, thereby contributing to
postprandial hyperglycaemia. Dotted lines represent regulation, that is,
stimulation or inhibition; thicker lines represent increased (flux through)
pathways, thinner lines represent decreased (flux through) pathways. CY,
cytosol; ER, endoplasmic reticulum; LD, lipid droplet; MT, mitochondria; PM,
plasma membrane; G3P, glycerol-3-phosphate; LPA, lysophosphatidic acid; PA,
phosphatidic acid; GPAT, acyl-CoA:glycerol-3-phosphate acyltransferase;
AGPAT, acyl-CoA:1-acyl-glycerol-3-phosphate acyltransferase; DGAT2, DAG-O-acyltransferase;
PAP, phosphatidate phosphatase; MGL, monoacylglycerol
lipase.


pool, which allosterically stimulates gluconeogenesis or activates 
nutrient-sensitive pathways (ChREBP, mTORC and SREBP) to collec- 
tively stimulate the transcriptional DNL program. Elevated hepatic 
acyl-CoA favours production of sn-1,2-DAG, sphingolipids and TAG. 


In obese humans with NAFLD, the sn-1,2-DAG-PKCe pathway tightly 
correlates with hepatic insulin resistance®’ ©°, whereas ceramide-JUN 
N-terminal kinase (JNK) correlates more with hepatic oxidative stress 
and inflammation®*!” (Fig. 1b). In this context, lowering cellular cera- 
mide by ablating dihydroceramide desaturase 1 increased mitochon- 
drial oxygen flux and improved steatosis and glucose metabolism in 
insulin-resistant mice®. Conversely, mitochondrial C16:0 ceramide, 
generated by overexpression of ceramide synthase 6 (CerS6), inter- 
acts with mitochondrial fission factor (MFF) to promote mitochon- 
drial fragmentation, insulin resistance and steatosis™. Silencing of 
MFF prevented CerS6-dependent metabolic abnormalities despite 
elevated C16:0 ceramide. This suggests that the effects of ceramides 
on insulin-stimulated glucose metabolism might result indirectly from
impaired mitochondrial function with lower fatty acid oxidation, giv- 
ing rise to other metabolites, for example, sn-1,2-DAG or acetyl-CoA, 
rather than from direct ceramide interference with insulin signalling. 
Recent studies indicate a critical role of molecular compartmentation 
of sn-1,2-DAGs, specifically in the plasma membrane, in inducing nPKC
translocation and insulin resistance. Mice treated with CGI-58 antisense 
oligonucleotide exhibit elevated hepatic TAG and DAG in lipid droplets, 
are protected from lipid-induced hepatic insulin resistance and show 
reductions in plasma membrane DAG and PKCε translocation®.
Alvarez-Hernandez et al. monitored the earliest diet-induced metabolic
alterations by examining the effect of a single oral saturated fat
load in healthy humans“. This study revealed that saturated fat simultaneously
induces insulin resistance in liver, skeletal muscle and WAT,
and is associated with 70% higher rates of hepatic gluconeogenesis and 
20% lower rates of net hepatic glycogenolysis. Similar studies in mice 
found upregulated expression of toll-like receptor (TLR) and inflam- 
matory pathways, which might contribute to progression of NAFLD, 
including non-alcoholic steatohepatitis (NASH)®. Of note, chronic 
overfeeding also increased levels of intestine-derived endotoxins 
promoting TLR4-induced cytokine release by Kupffer cells®**. Other 
intestinal functions also affect glycaemia and diabetes risk: integrin 
β7-knockout mice, which lack natural small-intestinal intraepithelial
T lymphocytes, are metabolically hyperactive and resistant to obesity
and diabetes”. Finally, dietary habits may affect the gut microbiota, 
modulating intestinal metabolite release and insulin sensitivity”. 
Humans with T2D and NAFLD show distinct metagenomic signatures 
along with increased branched-chain amino acids”’” and decreased 
short-chain NEFA”, which may affect body weight and metabolism. 
In summary, overnutrition and WAT dysfunction lead to increased
WAT lipolysis, which promotes insulin-independent hepatic lipogenesis
resulting in increased ectopic lipid deposition and increased hepatic
gluconeogenesis owing to increased acetyl-CoA stimulation
of pyruvate carboxylase as well as increased glycerol conversion
to glucose. This mechanism obviates the previously reported need to 
invoke selective hepatic insulin resistance to explain the discordance of 
increased hepatic lipogenesis occurring simultaneously with increased 
gluconeogenesis“ (Fig. 1b). This is in line with recent studies showing 
that weight loss caused by very-low caloric diets rapidly normalizes 
hepatic steatosis and insulin resistance in liver, but not intramyocellular 
lipid content or muscle insulin resistance in individuals with T2D°>". 


Insulin resistance in skeletal muscle 


Studies using ¹³C and ³¹P magnetic resonance spectroscopy identified
impaired insulin-stimulated glycogen synthesis as the major factor 
responsible for insulin resistance in muscle and reduced insulin- 
stimulated glucose transport activity as the rate-controlling step that 
underlies lower glycogen synthesis in patients with T2D and their insu- 
lin-resistant relatives” ” (Fig. 2a). Reduced insulin-stimulated glucose 
transport can be mainly attributed to defective insulin signalling at 
the level of IR and IRS-1-associated PI3K, which has been observed in 
one study to occur without altered AKT phosphorylation®®. Whereas
the majority of studies in humans point to proximal defects in insulin
signalling, some experimental models provide evidence for distal
abnormalities*'*”. Glycogen synthase can also be stimulated via insulin 
regulation of glycogen synthase kinase-3 (GSK3) or independently via 
allosteric activation by glucose-6-phosphate® in skeletal muscle” ’7** 
and liver?®**, but its activity does not appear to regulate insulin-stimu- 
lated glucose disposal (Figs. 1, 2). 

Lean first-degree relatives of patients with T2D present with predomi- 
nantly muscle insulin resistance”. Ingestion of two high-carbohydrate 
meals revealed their early metabolic abnormalities: ingested carbo- 
hydrates were diverted from muscle glycogen synthesis to the liver, 
where augmented carbohydrate availability and compensatory hyper- 
insulinaemia promoted hepatic DNL, hepatic TAG synthesis and VLDL 
secretion, hypertriglyceridaemia and reduced plasma high-density 
lipoproteins’. The critical importance of skeletal muscle is illustrated 
by the observation that a single bout of exercise, which activates AMP
kinase and promotes translocation of the glucose transporter GLUT4 and
glucose uptake independently of insulin*®*, completely reversed these
abnormalities”*”. Insulin-resistant individuals also exhibit reduced
muscle mitochondrial density, gene expression and function, which 
impedes lipid oxidation. This, combined with augmented hepatic 
TAG release, contributes to muscle lipid accumulation. Collectively, 
these findings suggest a specific phenotype, whereby genetic and/ 
or acquired reductions in muscle mitochondrial function predispose 
these individuals to sn-1,2-DAG accumulation, activation of PKCθ and
PKCε and insulin resistance in muscle, which can be enhanced further
by excessive production of reactive oxygen species® (Fig. 2b). Such 
selective muscle insulin resistance also increases cardiometabolic 
risk owing to increased TAG and VLDL production and subsequent 
dyslipidaemia. The association between muscle insulin resistance and 
abnormal mitochondrial function represents a frequently observed 
feature of the elderly and people prone to or with overt T2D”*’. 

There is increasing evidence supporting a hypothesis whereby
gene variants in mitochondrial DNA (mtDNA) and mitochondrial-function-related
nuclear DNA contribute to insulin resistance and
T2D or abnormal exercise-induced responses. Gene variants in
mitochondrial-function-related nuclear DNA lead to relatively mildly
impaired mitochondrial function, whereas classical mtDNA gene
variants typically cause a severe reduction in mitochondrial function
with neurological deficits and β-cell failure. In contrast to genetic and
acquired alterations that lead to mild impairments in mitochondrial
activity and a predisposition to ectopic lipid accumulation and insulin
resistance, alterations that lead to severe reductions in mitochondrial
activity (for example, mtDNA variants) result in increased dependency
on anaerobic glycolysis, hyperlactaemia and increased glucose
metabolism. In support of this hypothesis, a recent European GWAS
reported that a nonsynonymous variant of N-acetyltransferase 2 (NAT2)
is associated with insulin resistance and related traits as well as with
decreased adipocyte differentiation and insulin-mediated glucose uptake
and with increased WAT lipolysis. Silencing or knocking down the mouse
NAT2 orthologue, NAT1, induces insulin resistance, glucose intolerance
and exercise intolerance, and is associated with ectopic accumulation
of TAG and DAG and activation of hepatic PKCε and
muscle PKCθ. Nat1 mice also display mild reductions in mitochondrial
function and altered morphology, demonstrating another genetic
link between reduced mitochondrial function, TAG and DAG deposition
and nPKC-induced liver and muscle insulin resistance. Further
supporting a relevant role of mitochondria for the development of
insulin resistance, mice with muscle-specific knockout of sarcolipin,
which is required for mitochondrial sarcoendoplasmic reticulum Ca²⁺-ATPase
(SERCA) uncoupling and lipid oxidation, develop obesity and
DAG-nPKC-mediated muscle insulin resistance, whereas sarcolipin
overexpression prevents obesity-induced insulin resistance. Other
gene variants may also predispose humans to muscle insulin resistance
and T2D independently from altered mitochondrial function, such as
the AKT2 partial loss-of-function mutation that results in lower insulin-stimulated
muscle and adipose glucose uptake, whereas an activating
mutation causes fasting hypoglycaemia. AS160 (also known as
TBC1D4) gene variants suggest links between insulin signalling and
glucose transport leading to muscle insulin resistance, postprandial
hyperinsulinaemia and hyperglycaemia. Furthermore, RAC1-mediated
glucose transport can become dysregulated in insulin-resistant
murine and human skeletal muscle (Fig. 2a,b). These genotypes may
have been advantageous for preserving glucose for other tissues in the
prehistoric arctic environment.


Adipose tissue dysfunction 


As in skeletal muscle, adipose tissue of insulin-resistant humans exhibits
reductions in membrane IR content, IR tyrosine kinase activity and
insulin-stimulated glucose uptake. Although WAT accounts for
less than 5% of postprandial glucose disposal, it has a disproportionate
role in regulating whole-body glucose metabolism through its ability
to alter rates of hepatic gluconeogenesis through NEFA and glycerol
release. Furthermore, increased WAT glucose uptake can enhance
ChREBP-mediated lipogenesis, provide glycerol-3-phosphate for NEFA
esterification or serve as a signal for adipokines. Other pathways, such
as insulin-regulated β-adrenergic lipolysis or cytokine interaction,
may also contribute to dysregulated WAT metabolism. Insulin
resistance in WAT shows temporal differences in the direction of net
NEFA flux across the capillary bed between fasting and postprandial
states, indicating lower fluctuations in obese men with impaired net
adipose fat storage. Finally, compartment-specific differences with
higher WAT lipolysis and lower lipogenesis in visceral versus subcutaneous
adipose tissue could enhance portal lipid delivery to the liver and
contribute to metabolic dysregulation.

Despite the association between body fat mass and insulin resistance,
there is accumulating evidence that abnormal adipocyte function as
well as liver and muscle lipid metabolites (including sn-1,2-DAG and
ceramide), but not fat mass per se, underlie common insulin resistance.
In addition to lifestyle, GWAS suggest that genetic variants may
affect the association of body fat mass or ectopic fat distribution with
glycaemia and insulin resistance. Some recently identified gene loci
are associated with insulin resistance
at a given level of adiposity and modulate insulin sensitivity via
adipocyte differentiation, supporting the concept that limited WAT
storage capacity rather than overall fat mass is the main contributor to
insulin resistance and associated diseases. Epigenetic adipose tissue
modifications may further influence these interactions.

Enlarged WAT mass and adipocyte size have been linked with inadequate
vascularization, hypoxia, fibrosis and/or macrophage infiltration
with low-grade inflammation (Fig. 1b). High-fat diet and obesity may
activate saturated fatty acid-stimulated adenine nucleotide translocase
2 (ANT2), an inner mitochondrial membrane protein, which results
in relative adipocyte hypoxia and triggers the transcription factor
hypoxia-inducible factor-1α (HIF-1α), setting off adipose dysfunction
and inflammation. Adipocyte-specific Ant2 (also known as Slc25a5)
deletion improves obesity-induced adipocyte hypoxia by lowering
oxygen demand (despite unchanged mitochondrial mass) and, in turn,
inflammation and insulin resistance. This suggests that fatty acid-mediated
ANT2 activation may be an early event in adipocyte dysfunction
and a possible target for novel insulin sensitizers or anti-obesity
drugs. Other early events comprise mechanical stress on membranes
and extracellular matrix, causing dynamic adaptation until adipocyte
death, apoptosis or de-differentiation. During this process, release of
chemotactic signals attracts bone marrow-derived pro-inflammatory
M1 macrophages, leading to adipose remodelling by a wide range of
activities including PPARγ-driven lipid storage, extracellular matrix
modification, lysosomal clearance of dead adipocytes and cytokine
release (Fig. 3). Compared with acute clinical inflammation, metabolic
macrophage activation exhibits distinct activation modes, involving
the chemokine monocyte chemotactic protein 1 (MCP-1) and its receptor
CCR2, B2 lymphocytes and/or interferon-γ and tumour necrosis factor
(TNF), produced by natural killer cells in visceral adipose tissue.
Local cytokine release (TNF, interleukin (IL)-1β and IL-6) within WAT can
suffice to induce adipose insulin resistance without overflow-related
effects in distant tissues, usually associated with so-called subclinical
or low-grade inflammation. In line with these results, insulin-resistant
obese adolescents displayed more than 20-fold higher IL-6 levels in adipose
tissue than in plasma, and increased adipose tissue expression of
CGI-58 protein, similar to observations in high-fat fed rats. Of course,
continued adipose tissue enlargement and concomitant stress will lead
to cytokine overflow, creating an imbalance between insulin-sensitizing
(including adiponectin and leptin) and pro-inflammatory adipokines
(including RBP4, resistin, IL-6 and TNF). These endocrine effects have
been intensively studied over the past decades, demonstrating inhibition
of IR kinase activity or activation of JNK, oxidative and ER stress.
However, the concentrations required to achieve inhibitory effects
are often orders of magnitude higher than those measured in plasma
from patients with insulin resistance, and pharmacological agents are
not generally specific for inflammatory pathways. Recent clinical trials
examining IL-1β inhibition with canakinumab, despite causing large
reductions in C-reactive protein and IL-6, did not reduce incidence
of chronic hyperglycaemia in T2D. Inhibiting the obesity-related
kinases, IKKε and TBK1, with amlexanox slightly reduced glycaemia,
albeit with a paradoxical increase in serum IL-6. Of note, insulin resistance
can exist without any relevant adipose tissue inflammation; this is
supported by lipodystrophy models in humans and mice. Moreover,
rats rapidly develop WAT insulin resistance after just several days of
high-fat feeding associated with liver and muscle lipid deposition,
whereas adipocyte death and macrophage infiltration are detectable
only after four weeks. Similarly, healthy humans exhibit adipose
tissue insulin resistance within hours after saturated fat loading without
alteration of circulating anti- or pro-inflammatory markers.
Together, these findings indicate that metabolic changes leading
to ectopic lipid accumulation are relatively early events in the pathogenesis
of insulin resistance and T2D, whereas WAT inflammation with
cytokine spillover represents a chronic alteration that occurs later, promoting
progression to fasting and postprandial hyperglycaemia in
conjunction with reduced β-cell function. Recent studies have also
implicated other adipose-derived factors, for example, circulating
exosomal miRNAs, which may contribute to gene expression in distant
tissues and glucose tolerance, as demonstrated in mice lacking the
miRNA-processing enzyme Dicer and in lipodystrophic humans.


Cerebral regulation of hepatic metabolism 


High energy requirements and limited energy storage capacity in the
brain may explain why cerebral energy supply by glucose and ketones
is completely dependent on the liver and, to some extent, kidney, during
starvation and nearly independent of direct endocrine regulation.
Conversely, cerebral insulin action may affect appetite control, mood,
cognitive function and possibly peripheral glucose metabolism.
In mice, insulin and leptin act directly on the hypothalamic arcuate
nucleus to activate proopiomelanocortin and inhibit Agouti-related
protein neurons, whereas adipostatic signals stimulate melanocortin
4-expressing paraventricular neurons to induce satiety and energy
expenditure. Hypothalamic inflammation, reflected by higher
mediobasal hypothalamic gliosis in obese rodents and humans, has
been suggested to lead to chronic central insulin and leptin resistance,
which would promote excessive food intake and body weight gain. In
rodents, central insulin action lowered EGP, hepatic gluconeogenesis,
WAT lipolysis and glucagon secretion, but increased muscle glucose
uptake (Fig. 3). However, carefully controlled studies failed to
confirm similar brain insulin action to regulate hepatic glucose fluxes
in awake dogs. In humans, intranasal insulin application did not
affect fasting EGP, slightly decreased hepatic fat and increased ATP
content in glucose-tolerant individuals, but not in people with T2D.
Similarly, KATP-channel activation decreased EGP only in glucose-tolerant
humans. Some studies suggested that cerebral insulin action
results in parasympathomimetic IL-6 secretion by Kupffer cells to
inhibit hepatic gluconeogenesis. All of these studies are limited
by experimental conditions such as application and dosing of insulin,
spillover of intranasally delivered insulin into systemic circulation and
suitable metabolic control. Nevertheless, the brain can be involved in
other aspects of interorgan crosstalk, orchestrated by metabolites,
adipokines (leptin) or enteroendocrine circuits (such as glucagon-like
peptide 1, gastric inhibitory polypeptide, ghrelin, cholecystokinin or
fibroblast growth factor (FGF)-19). The hypoleptinaemia-mediated stimulation
of the HPA axis with subsequent stimulation of WAT lipolysis
might be an example of how the human brain could indirectly regulate
hepatic gluconeogenesis and EGP during starvation.

Fig. 3 | A unified concept of insulin resistance in humans. a, Overnutrition
leads to adipose tissue hypertrophy and hyperplasia and ectopic TAG
deposition, mainly in muscle and liver, which are key features of insulin
resistance. b, Adipose dysfunction, possibly due to local hypoxia ultimately
resulting in apoptosis and cell death, recruits and transforms macrophages
to release, for example, TNF and interleukins (IL-1β and IL-6). Local
inflammatory reaction increases lipolysis directly or via inhibiting insulin
signalling with subsequent release of NEFA and glycerol from WAT.
Chronically, adipose dysfunction alters adipocytokine secretion favouring
systemic low-grade inflammation. c, In liver, glycerol as substrate and
NEFA-derived acetyl-CoA, allosterically activating pyruvate carboxylase
flux (Vp), stimulate gluconeogenesis and fasting glucose production. NEFA
and glycerol, as substrates of TAG accumulation, initiate NAFLD in the
absence of adequate mitochondrial function and generate lipotoxic
metabolites that inhibit insulin signalling. d, Hepatocellular insulin
resistance not only chronically upregulates gluconeogenesis, but also
decreases insulin-stimulated net glycogen synthesis and glucose uptake, in
turn raising postprandial glucose production. e, In muscle, increased NEFA
availability, accelerated by (possibly inherited) inadequate mitochondrial
fat oxidation, also favours lipid synthesis, inhibiting insulin-stimulated
glucose transport and glycogen synthesis. This, combined with lower
non-insulin-mediated glucose uptake due to sedentary lifestyle, contributes
to the postprandial glucose rise. f, These different mechanisms, along with
direct stimulation by nutrients and enteroendocrine signals (such as GLP-1
and GIP) increase the insulin:glucagon secretion ratio, resulting in
normoglycaemia at the expense of hyperinsulinaemia. g, Chronically, both
acquired and inherited factors impair insulin secretion, with subsequent
postprandial and fasting hyperglycaemia. The brain may also contribute to
regulation of peripheral metabolism via afferent (for example, leptin from
WAT (orange dashed line)) and efferent (for example, to liver (grey dashed
line)) signalling.


A unifying concept of the development of T2D


Recent studies assessing rates of hepatic pyruvate carboxylase flux, palmitate
turnover and hepatic acetyl-CoA content in an awake rat model
of T2D revealed a key mechanism by which insulin acutely suppresses
hepatic gluconeogenesis and how increased WAT lipolysis, owing to
macrophage infiltration with localized inflammation, can increase
rates of hepatic gluconeogenesis and cause fasting hyperglycaemia.
During hyperinsulinaemic-normoglycaemic clamps, nondiabetic rats
exhibited a sequence of events starting with a 90% fall in circulating
NEFA and glycerol within 5 min, followed by a 50% reduction in hepatic
acetyl-CoA and 70% suppression of EGP within 10 min without affecting
hepatic glycogen, circulating lactate or glucagon concentrations.
Furthermore, insulin-mediated suppression of WAT lipolysis, leading
to reductions in hepatic acetyl-CoA, could entirely explain acute
insulin-induced suppression of hepatic gluconeogenesis. In line with
these results, rodents lacking canonical hepatocellular insulin signalling
(AKT1-, AKT2-, FOXO1- or IR-antisense oligonucleotide treatment)
showed intact insulin-mediated EGP suppression. Together,
these studies demonstrate that, in contrast to the effects of insulin
to stimulate hepatic glycogen synthesis through direct stimulation
of hepatic insulin signalling, insulin acutely suppresses hepatic
gluconeogenesis, mostly via an indirect mechanism through suppression
of WAT lipolysis (Fig. 3).

Nature | Vol 576 | 5 December 2019 | 57

Conversely, rats on a four-week high-fat diet developed fasting hyperglycaemia
along with 25% higher rates of EGP, owing to increases in
hepatic pyruvate carboxylase flux and glycerol-to-glucose conversion.
During clamps, impaired EGP suppression, along with higher rates of
pyruvate carboxylase flux, could be attributed to increased hepatic
acetyl-CoA content resulting from greater WAT lipolysis. Inhibition of
lipolysis in atglistatin-treated rat models of T2D or high-fat fed adipose
triglyceride lipase (ATGL)-knockout mice reversed these abnormalities.
Furthermore, this study depicted distinct time-dependent alterations
in WAT biology, starting with adipocyte hypertrophy, followed by
increased levels of macrophage-secreted granulocyte-macrophage
colony-stimulating factor (GM-CSF) and IL-6 in plasma and WAT.
Consistent with the potential role for localized macrophage-induced
WAT lipolysis, IL-6 infusion stimulated, whereas anti-IL-6 treatment or
macrophage-specific JNK knockout ameliorated, WAT lipolysis, hepatic
insulin resistance and gluconeogenesis. Translating these studies to
humans, obese insulin-resistant adolescents also exhibited increased
fasting EGP, impaired insulin-mediated suppression of lipolysis and EGP,
macrophage infiltration and 50% higher IL-6 concentrations in
WAT. Taken together, these studies indicate that macrophage-induced
cytokine-mediated WAT lipolysis raises hepatic acetyl-CoA content and
pyruvate carboxylase activity and flux, probably serving as a molecular
mechanism linking WAT inflammation to both fasting and postprandial
hyperglycaemia (Fig. 3). These data also challenge the canonical
view of inflammation-mediated hepatic insulin resistance occurring
through activation of the NF-κB-JNK-ceramide biosynthetic pathways
and explain the relatively mild metabolic phenotype in rodents with
abrogated hepatic insulin action.

Nevertheless, hepatic insulin resistance, concomitant increases in
EGP and hyperglycaemia in T2D are probably multifactorial in nature
and not exclusively due to increased WAT lipolysis. Supporting this
view, a three-day very low-calorie diet reversed hyperglycaemia in a rat
model of uncontrolled T2D, not only via reductions in hepatic acetyl-CoA
with lower rates of hepatic gluconeogenesis, but also via reductions
of hepatic DAG-PKCε-mediated hepatic insulin resistance and
lower rates of hepatic glycogenolysis. Of note, these effects occurred
independently of any changes in hepatic ceramides, cytokines, plasma
branched-chain amino acids, glucagon, corticosterone or FGF-21.

Holistically, adaptation of metabolic fluxes during fasting and obesity-related
diabetes represents the response of WAT to altered substrate
supply, which would protect distant insulin-dependent tissues
from substrate oversupply and provide sufficient vital substrates to
the brain. We postulate that the biology of fasting and postprandial
hyperglycaemia depends on dysregulated WAT lipolysis (and possibly
contributions from intrahepatic lipolysis) driving hepatic gluconeogenesis
through allosteric hepatic acetyl-CoA activation and altered substrate
signalling, preferably via the sn-1,2-DAG-nPKC pathway (Fig. 3). This
concept also highlights important targets for future T2D treatment.


Outlook 


The idea that metabolic flux adaptation in liver and, to some extent,
skeletal muscle is largely orchestrated by WAT in health and disease is
supported by a series of studies in humans and model organisms. Nevertheless,
several aspects still require confirmation both on a molecular-cellular
level and in humans under specific metabolic conditions.
First, the initial events leading to adipocyte dysfunction and the factors
responsible for ectopic lipid deposition in skeletal muscle are not fully
understood in humans. Second, the subcellular distribution of different
lipid mediators in various compartments and their interactions
with nPKC activation and other downstream factors require additional
translational studies. In this context, certain intracellular lipids may
be linked with insulin sensitivity to varying degrees in sedentary and
strongly in physically active humans. Moreover, genetically modified
animals such as adipose- and liver-specific PKCε-knockout mice are
helpful for exploring cellular pathways, but require detailed analysis
of the experimental conditions and ultimately testing of their
relevance in humans. Third, the rapidly growing body of multi-omics
data might contribute to a better understanding of the cooperative
action of metabolites to modify flux rates and insulin signalling. Along
these lines, the relevance of metagenomics and epigenomics for the
initiation, amplification or reversal of insulin resistance in humans is
still largely unclear, despite recently gained insights into the dynamic
regulation of insulin sensitivity following metabolic surgery. Recent
detection of different T2D phenotypes will reinforce investigation
of gene variants, metabolites and neuro-immune-endocrine signals for
interorgan communication regulating insulin sensitivity. We anticipate
that future studies will reveal the mechanisms that underlie insulin
resistance and β-cell dysfunction, which will guide precision medicine
towards more effective treatments for T2D and related disorders such
as NAFLD (including NASH) and the metabolic syndrome.

The detection of a dust disk around the white dwarf star G29-38 and transits from
debris orbiting the white dwarf WD 1145+017 confirmed that the photospheric
trace metals found in many white dwarfs arise from the accretion of tidally disrupted
planetesimals. The composition of these planetesimals is similar to that of rocky
bodies in the inner Solar System. Gravitational scattering of planetesimals towards
the white dwarf requires the presence of more massive bodies, yet no planet has so
far been detected at a white dwarf. Here we report optical spectroscopy of a hot
(about 27,750 kelvin) white dwarf, WD J091405.30+191412.25, that is accreting from a
circumstellar gaseous disk composed of hydrogen, oxygen and sulfur at a rate of
about 3.3 × 10⁹ grams per second. The composition of this disk is unlike all other
known planetary debris around white dwarfs, but resembles predictions for the
makeup of deeper atmospheric layers of icy giant planets, with H₂O and H₂S being
major constituents. A giant planet orbiting a hot white dwarf with a semi-major axis of
around 15 solar radii will undergo substantial evaporation with expected mass loss
rates comparable to the accretion rate that we observe onto the white dwarf. The orbit
of the planet is most probably the result of gravitational interactions, indicating the
presence of additional planets in the system. We infer an occurrence rate of
approximately 1 in 10,000 for spectroscopically detectable giant planets in close
orbits around white dwarfs.


WD J091405.30+191412.25 (henceforth WD J0914+1914) was initially
classified as a short-period interacting white dwarf binary on
the basis of a weak Hα emission line detected in its spectrum
obtained by the Sloan Digital Sky Survey (SDSS). Upon closer inspection
of this spectrum, we identified additional emission lines of
oxygen (O I at wavelengths 7,774 Å and 8,446 Å), and an emission
line near 4,068 Å that we tentatively identified as [S II]. The line
flux ratios of the hydrogen and oxygen lines are extremely atypical
for any white dwarf binary, casting doubt on the published
classification.

We obtained deep spectroscopy of this star using the X-Shooter
spectrograph on the Very Large Telescope of the European Southern
Observatory (see Fig. 1), which confirms the presence of [S II] (4,068 Å),
and contains additional emission lines of [O I] (6,300 Å and 6,363 Å) as
well as a blend of O I and S I lines near 9,200 Å.

The double-peaked morphology of the Hα and the O I (8,446 Å) emission
lines (see Fig. 1) indicates an origin in a circumstellar gas disk,
reminiscent of several white dwarfs with dusty and gaseous planetary
debris disks. However, the spectra of all known gaseous debris disks
are dominated by the emission lines of the Ca II triplet (8,600 Å), with
weaker lines of Fe II, which are absent in the X-Shooter observations
of WD J0914+1914. Moreover, none of the other gaseous debris disks
around white dwarfs show Hα emission.


The X-Shooter spectrum of WD J0914+1914 displays strong Balmer
lines, implying a hydrogen-dominated atmosphere, as well as numerous
sharp absorption lines of oxygen and sulfur (see Fig. 2). We determined
an effective temperature of T_eff = 27,743 ± 310 K
and a surface gravity of log(g) = 7.85 ± 0.06 from the well flux-calibrated
SDSS spectra (see Extended Data Fig. 1 and Extended Data Table 1).
Fixing these two atmospheric parameters, we measured the photospheric
abundances of oxygen and sulfur, log(O/H) = −3.25 ± 0.20 and
log(S/H) = −4.15 ± 0.20, and derived upper limits for twelve additional
elements (see Fig. 3 and Extended Data Table 2). WD J0914+1914 is
accreting at a rate of about 3.3 × 10⁹ g s⁻¹, which is among the highest
rates observed for hydrogen-atmosphere white dwarfs polluted by planetary
debris. However, the measured accretion rate in WD J0914+1914
includes only oxygen and sulfur, and the influxes of these two elements
are an order of magnitude larger than in any other of these systems.
If thermohaline mixing or convective overshoot is efficient in the
atmosphere of WD J0914+1914, the accretion rate could be an order
of magnitude higher.
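
The abundances quoted here are logarithmic number ratios, so they combine by simple addition. The following is a minimal Python check, not part of the original analysis, that uses only values quoted in this article: log(O/H) = −3.25 from the text and log(S/O) = −0.9 from the Fig. 2 caption. It shows that the sulfur value of −4.15 corresponds to sulfur relative to hydrogen. The function and variable names are illustrative.

```python
import math

# Logarithmic number-abundance ratios quoted in the article.
LOG_O_H = -3.25  # photospheric log(O/H)
LOG_S_O = -0.9   # photospheric log(S/O), by number (Fig. 2 caption)


def combine_log_ratios(log_a_b: float, log_b_c: float) -> float:
    """Return log(A/C) given log(A/B) and log(B/C).

    Number ratios multiply, (A/C) = (A/B) * (B/C), so their logarithms add.
    """
    return log_a_b + log_b_c


# Sulfur relative to hydrogen in the photosphere.
log_s_h = combine_log_ratios(LOG_S_O, LOG_O_H)

# Linear number ratio of sulfur to oxygen atoms.
sulfur_per_oxygen = 10.0 ** LOG_S_O

print(f"log(S/H) = {log_s_h:.2f}")
print(f"S/O by number = {sulfur_per_oxygen:.3f}")
```

The same helper applied to the disk-model values reported for the circumstellar gas, log(S/O) = −0.5 and log(O/H) = 0.29, reproduces the quoted log(S/H) = −0.21.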

Fig. 1 | Emission lines from the circumstellar disk at WD J0914+1914. The
X-Shooter spectrum of WD J0914+1914 (black) contains strong and broad
emission lines of hydrogen, oxygen and sulfur. Hα (d) and O I 8,446 Å (f) are
double-peaked, indicating an origin in a circumstellar disk. O I 7,774 Å (e) and
the oxygen and sulfur lines near 9,240 Å (g) are multiplets, resulting in more
complex line profiles. The forbidden sulfur and oxygen lines (a, c) have a
smaller peak separation, indicating that they are emitted by material extending
to larger distances from the white dwarf compared to the other lines. The
spectra of the gaseous planetary debris disks detected at several other white
dwarfs are all dominated by the Ca II 8,600 Å triplet (f), with weak additional
emission lines of oxygen (e, f) and iron (b), as illustrated by the spectrum of the
prototypical system SDSS J1228+1040 (grey). The striking difference
between the two spectra illustrates the different composition of the planetary
material: gaseous in WD J0914+1914 and rocky in SDSS J1228+1040.

We used the spectral synthesis code Cloudy to model the photo-ionization
of the circumstellar gas disk by the intense ultraviolet flux
from the white dwarf (see Methods and Extended Data Figs. 2, 3). The
emission lines, which are Doppler-broadened by the Keplerian rotation
in the disk, originate from a gaseous disk extending to approximately
1–10 R☉ (where R☉ is the radius of the Sun; see Extended Data Figs. 2, 4)
from the white dwarf, at a density of ρ = 10"? g cm⁻³. The relative abundances
of oxygen and sulfur derived from this model, log(S/O) = −0.5,
are consistent with those measured from the photospheric analysis.
Hydrogen in the disk is strongly depleted with respect to oxygen and
sulfur, log(O/H) = 0.29 and log(S/H) = −0.21. The non-detection of
emission lines from other elements apart from hydrogen, oxygen and sulfur
allows stringent upper limits to be placed on the abundances of sodium,
silicon, calcium and iron in the disk (see Fig. 3 and Extended Data Table 2).

The abundances of the gaseous circumstellar disk, and of the trace
metals in the photosphere of WD J0914+1914, are distinctly non-solar
and inconsistent with accretion from the wind of a low-mass companion
star. A stellar companion is also ruled out by the stringent upper limits
on the radial velocity variations of the white dwarf and the absence
of an infrared excess (see Methods). In contrast to the white dwarfs
known to be contaminated by planetary debris, the material accreted
by WD J0914+1914 is extremely depleted in the major rock-forming
elements magnesium, silicon, calcium and iron with respect to the bulk
Earth, and the circumstellar disk at WD J0914+1914 is much larger than
the canonical Roche radius for a rocky body. Both facts argue against
tidally disrupted planetesimals as the origin of either the gaseous
disk, or the photospheric trace metals that we detected. Based on the
observational evidence, WD J0914+1914 is a white dwarf accreting from
a purely gaseous circumstellar disk, and the most plausible origin of
the material in that disk is an evaporating giant planet on a close-in
orbit around the white dwarf.
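
The connection between disk radius and Doppler line width can be illustrated with Kepler's laws. The sketch below is an illustration only, not the modelling used for this star. It assumes circular orbits and a white dwarf mass of 0.6 solar masses, a typical value chosen for illustration because this section does not quote the mass. The function names are hypothetical. It evaluates the orbital speed at the 1–10 R☉ disk radii and the orbital period at the roughly 15 R☉ separation quoted for the planet.

```python
import math

G = 6.674e-8      # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.989e33  # solar mass, g
R_SUN = 6.957e10  # solar radius, cm

# ASSUMPTION: 0.6 solar masses is a typical white dwarf mass, used here
# only for illustration; it is not a value quoted in this section.
M_WD = 0.6 * M_SUN


def kepler_speed_km_s(r_rsun: float, mass_g: float = M_WD) -> float:
    """Circular Keplerian speed (km/s) at radius r_rsun (solar radii)."""
    return math.sqrt(G * mass_g / (r_rsun * R_SUN)) / 1.0e5


def kepler_period_days(a_rsun: float, mass_g: float = M_WD) -> float:
    """Orbital period (days) for semi-major axis a_rsun (solar radii)."""
    a_cm = a_rsun * R_SUN
    return 2.0 * math.pi * math.sqrt(a_cm**3 / (G * mass_g)) / 86400.0


for radius in (1.0, 10.0):
    print(f"v({radius:g} R_sun) = {kepler_speed_km_s(radius):.0f} km/s")
print(f"P(15 R_sun) = {kepler_period_days(15.0):.1f} days")
```

Under these assumptions the speed falls as the inverse square root of the radius, so gas at larger radii produces narrower double-peaked profiles, in line with the smaller peak separation noted for the forbidden lines.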

The abundances of WD J0914+1914 are reminiscent of the deeper
layers of the ice giants in the Solar System. Modelling the radio and
microwave spectrum of Uranus required low concentrations of ammonia
(NH₃) and large concentrations of H₂O. Condensation of ammonia
and hydrogen sulfide (H₂S) into ammonium hydrosulfide (NH₄SH) is
potentially efficient at removing ammonia from the atmosphere. However,
for a solar sulfur-to-nitrogen ratio, there is insufficient sulfur to
sequester all NH₃ into NH₄SH. A plausible model for the spectrum of Uranus
required H₂O and H₂S concentrations increased by factors of a few
hundred with respect to their solar values. H₂S was recently detected
in the atmospheres of Uranus and Neptune, confirming that H₂S ice
is a major constituent of the deeper cloud layers of icy giant planets.

Fig. 2 | Photospheric oxygen and sulfur lines. The optical spectrum of WD J0914+1914
contains strong photospheric lines of oxygen (a) and sulfur (b), indicating
the ongoing accretion from the circumstellar gas disk. A spectral analysis of these lines results in log(S/O) = −0.9 (by number).

Fig. 3 | Abundances of the planetary material at WD J0914+1914. Shown are
the number abundances relative to oxygen, normalized to the corresponding
ratio for the bulk Earth and sorted by condensation temperature. The error
bars represent 1σ uncertainties. The only detected elements are hydrogen (in
the circumstellar gas), oxygen and sulfur. Blue dots represent the abundances
measured from the analysis of the white dwarf photosphere, red dots represent
those derived from the Cloudy photo-ionization model for the circumstellar
gas, and the respective upper limits are shown by downward arrows. Included
for comparison are the abundances of the Sun (long dashed lines), CI
chondrites (short dashed lines), three white dwarfs accreting rocky debris
(triangles, which scatter closely around the bulk Earth abundances) and the
one white dwarf accreting a Kuiper-belt-like object (squares, broadly
resembling solar abundances). The material at WD J0914+1914 is depleted by
orders of magnitude in rock- and dust-forming elements (Si, Fe, Mg, Ca) with
respect to all known minor planetary bodies and stars.

Intense high-energy (extreme-ultraviolet, EUV) irradiation of
Neptune-mass exo-planets results in the photo-evaporation of their
atmospheres. Estimated mass loss rates of warm Neptunes with
semi-major axes of a few solar radii reach 10⁸–10¹⁰ g s⁻¹ (for example,
GJ 436b and GJ 3470b), comparable to the accretion rate we derive
for WD J0914+1914. The high-energy stellar flux required for driving
the mass loss rates of the known warm Neptunes is a few per cent of
the total host star luminosity, compatible with the high-energy emission
of young stars. Photo-evaporation is also the process most likely
to cause the mass loss of the giant planet feeding WD J0914+1914.
With the accretion disk extending out to around 10 R☉, the planet is
probably located at approximately 15 R☉ (see Methods). A large fraction
of the luminosity of this moderately hot (T_eff = 27,750 K) white
dwarf emerges in the EUV, which results in high-energy irradiation
of the planet very similar to that of mass-losing warm Neptunes
orbiting main-sequence stars. The atmospheric escape rate driven
by the EUV flux of WD J0914+1914 may be as high as about 5 × 10" g s⁻¹
(see Extended Data Fig. 5 and Methods), exceeding those of the warm
Neptunes GJ 436b and GJ 3470b.

A fraction of the material escaping the atmosphere of the planet
remains gravitationally bound to the white dwarf, forming the circumstellar
disk detected in the double-peaked emission lines. From
this reservoir, the material eventually accretes onto the white dwarf,
resulting in photospheric oxygen and sulfur contamination. A photo-ionization
model for the gaseous disk implies a strong depletion of
hydrogen, which is expected to be the dominant species in the planet’s
atmosphere, within the circumstellar disk. In addition to its large EUV
luminosity, the hot white dwarf also emits copious amounts of Lyα
photons, substantially exceeding the solar Lyα flux (see Extended Data
Fig. 6 and Methods). Consequently, the inflow of hydrogen is inhibited
by its large cross-section in Lyα, strongly enhancing the abundances of
oxygen and sulfur in the circumstellar disk and in the accreted material.


A potential analogue to the planet at WD J0914+1914 is HAT-P-26b, 
a Neptune-mass planet” orbiting a K-type star with a period of 4.26 
days. The transmission spectrum of HAT-P-26b exhibits strong H₂O 
absorption bands, with no detection of carbon-based species”. The 
carbon abundance” in the atmosphere of HAT-P-26b, log(C/O) < −2, 
is below our detection threshold (log(C/O) < −1.55; see Methods). A 
detection of carbon in the photospheric spectrum of WD J0914+1914 
will require either substantially deeper optical spectroscopy than our 
200-min-long X-Shooter observations or far-ultraviolet spectroscopy 
of the strong C III 1,175 Å transition. Modelling the spectrum of 
HAT-P-26b predicts sulfur-based cloud-forming condensates”, but these are 
not directly detected. Despite the high temperature of WD J0914+1914, 
its small radius (0.015R⊙) implies a luminosity that is lower than that of 
F-, G- or K-type main-sequence host stars. Hence despite the intense 
EUV irradiation, a planet orbiting the white dwarf WD J0914+1914 will 
be cooler than an equivalent planet around a main-sequence star. 

Gravitational interactions in multi-planet systems can perturb plan- 
ets onto orbits with pericentres close to the white dwarf, where tidal 
effects are likely to lead to circularization of the orbit. Common enve- 
lope evolution provides an alternative scenario for bringing a planet 
into a close orbit around the white dwarf”, though it requires finely 
tuned initial conditions and only works for planets more massive than 
Jupiter (see Methods). As the white dwarf continues to cool, the mass 
loss rate will gradually decrease, and become undetectable in about 
350 million years (see Extended Data Fig. 8). By then, the giant planet 
will have lost around 0.002 Jupiter masses (or around 0.04 Neptune 
masses), that is, a very small fraction of its total mass. 
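
The mass budget quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming standard reference values for the Jupiter and Neptune masses and for the length of a year (none of these constants are taken from the paper); the function name is illustrative only.

```python
# Rough consistency check of the quoted mass-loss budget.
# The constants are standard reference values, not values from the paper.
M_JUPITER_G = 1.898e30    # Jupiter mass in grams
M_NEPTUNE_G = 1.024e29    # Neptune mass in grams
SECONDS_PER_YEAR = 3.156e7


def mass_loss_budget(lost_jupiter_masses, duration_years):
    """Return (mass lost in Neptune masses, time-averaged loss rate in g/s)."""
    lost_g = lost_jupiter_masses * M_JUPITER_G
    in_neptune_masses = lost_g / M_NEPTUNE_G
    mean_rate = lost_g / (duration_years * SECONDS_PER_YEAR)
    return in_neptune_masses, mean_rate


# 0.002 Jupiter masses lost over about 350 million years, as stated in the text.
neptune_masses, mean_rate = mass_loss_budget(0.002, 350e6)
print(f"{neptune_masses:.3f} Neptune masses, mean rate {mean_rate:.1e} g/s")
```

Under these assumptions the loss corresponds to just under 0.04 Neptune masses, in line with the figure given in the text, and to a time-averaged rate of a few times 10¹¹ g s⁻¹.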

The ubiquitous existence of planets around white dwarfs has been 
indirectly implied by the frequent signatures of planetesimals scattered 
onto orbits crossing the Roche radii of white dwarfs, with dynamical 
preference for sub-Jovian mass planets®”°. We have inspected all 
7,000 or so white dwarfs” with SDSS spectroscopy, brighter than g = 19 
and hotter than 15,000 K, for the presence of O I (7,774 Å and 8,446 Å) 
emission lines, but did not identify another system that resembles 
WD J0914+1914. Spectroscopic signatures of giant planets at white dwarfs 
are therefore rare, but follow-up observations of the approximately 
260,000 white dwarfs identified with Gaia”’ have the potential to dis- 
cover a sufficient number of such systems to enable a comparative 
study of their atmospheric compositions. 
Methods 


Discovery and follow-up observations 

Two SDSS spectra of WD J0914+1914 were taken in November 2005”° 
and March 2012” (see Extended Data Fig. 1 and Extended Data Table 1), 
revealing the Hα, oxygen and sulfur emission lines. No noticeable 
change in the strength of the emission lines is detected between the 
two epochs. 

We observed WD J0914+1914 on 2019 January 12 and 13 using 
X-Shooter” mounted on UT2 of the Very Large Telescope. X-Shooter 
is a three-armed spectrograph covering the extreme blue (UVB, 
330–560 nm), visual (VIS, 560 nm–1 μm) and near-infrared (NIR, 1–2.4 μm) 
simultaneously. We obtained ten spectra with 20-min exposure times 
each. Given the faintness of the star, z = 19.9, little signal was expected 
in the NIR arm, and we therefore used the ‘stare’ mode, that is, avoiding 
nodding. The data were reduced with the Reflex package adopting the 
standard settings and optimizing the slit integration limits”. Finally, a 
weighted average spectrum was computed from the individual UVB and 
VIS observations. The signal-to-noise ratio of this average spectrum is 
about 45 and about 55 at 4,300 Å and 7,000 Å, respectively. 

The X-Shooter spectrum contains the same emission lines detected 
in the SDSS spectra, plus several additional oxygen and sulfur lines 
(Fig. 1). The emission lines are broad and double-peaked, indicating 
that they originate in a circumstellar disk’**. Also present in the spec- 
trum are multiple strong photospheric absorption lines of oxygen and 
sulfur, implying ongoing accretion from the disk (Fig. 2). The detection 
of the emission lines in the 2019 X-Shooter spectra, and comparison 
with the 2005 SDSS spectrum places a lower limit of 14 years on the 
life-time of the disk. 


Stellar parameters of the white dwarf and its progenitor 

We measured the atmospheric parameters of WD J0914+1914 by 
fitting pure-hydrogen model spectra® to the two SDSS spectra, 
which are well flux-calibrated. We used the well established 
technique of fitting the Stark-broadened Balmer line profiles®**””, which 
are sensitive to both temperature and gravity. The total extinction 
along the line-of-sight towards WD J0914+1914 is low, 
E(B − V) = 0.0305 ± 0.0006 (ref. °°), and normalizing the Balmer lines 
before the fit effectively removes the effect of extinction. The 
parameters from the fits to the two SDSS spectra are consistent with each other 
within the uncertainties, and we take the variance-weighted average as 
the best-fit values (Extended Data Table 2). Using the cooling models of 
refs.” ”, we computed from the effective temperature, Teff = 27,743 ± 310 K, 
and the surface gravity, log(g) = 7.85 ± 0.06, a white dwarf mass of 
M_wd = (0.56 ± 0.03)M⊙ and a cooling age of 13.3 ± 0.5 million years. The 
quoted uncertainties are purely statistical. The magnitude 
of additional systematic uncertainties can, in principle, be assessed 
by comparing the results from the spectroscopic fit to a joint 
analysis of the photometry and parallax of the star*?**. However, the 
large parallax uncertainty of WD J0914+1914 (about 22%) severely 
limits the precision of the atmospheric parameters derived from 
such an analysis”. The spectrophotometric distance implied by our 
fit is about 625 pc, consistent with the upper limit on the distance 
based on the Gaia parallax®’. As an alternative independent test of 
our spectroscopic fit, we applied an extinction of E(B − V) = 0.0305 
to the model spectrum with the atmospheric parameters given 
above, and then scaled that reddened model to the SDSS r-band 
magnitude. We computed a GALEX near-ultraviolet magnitude of 
18.07 from this model, which agrees well with the observed value 
of 18.06 ± 0.03 (ref. *°). 
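
The variance-weighted average used to combine the two SDSS fits can be illustrated as follows. This is a minimal sketch of inverse-variance weighting in general, not the authors' fitting code; the two per-epoch temperatures and their uncertainties are hypothetical placeholders, since the individual fit results are not quoted in the text.

```python
import math


def variance_weighted_mean(values, sigmas):
    """Combine independent measurements using weights 1/sigma^2.

    Returns the weighted mean and its 1-sigma uncertainty.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    return mean, math.sqrt(1.0 / total_weight)


# Hypothetical per-epoch effective temperatures (K) and uncertainties (K);
# these numbers are illustrative only and do not come from the paper.
teff, teff_err = variance_weighted_mean([27600.0, 27900.0], [420.0, 460.0])
print(f"Teff = {teff:.0f} +/- {teff_err:.0f} K")
```

The combined uncertainty is always smaller than the smallest individual uncertainty, which is why two consistent fits tighten the final parameters.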

There is still quite some uncertainty in the low-mass end of the 
initial-to-final mass relation. Using two different relations results in progenitor 
masses of about 1.0M⊙ (ref. *”) and about 1.6M⊙ (ref. *). The larger value 
is in closer agreement with many of the earlier works on the 
initial-to-final mass relation’. The main-sequence lifetimes of stars in this mass 
range are about 2–10 billion years, that is, the white dwarf cooling age 
is negligible compared to the total system age. 


Photospheric abundances 

Fixing the atmospheric parameters as derived above, Teff = 27,743 K 
and log g = 7.85, we computed synthetic spectra® for a wide range of 
abundances of C, N, O, Ne, Na, Mg, Al, Si, P, S, Cl, Ar, K, Ca, Sc, Ti, V, Cr, 
Mn and Fe, and fitted those models to the average X-Shooter spectrum. 
The only elements detected in the photosphere are oxygen and sulfur at 
log(O/H) = −3.25 ± 0.20 and log(S/H) = −4.15 ± 0.20 (by number), 
implying log(S/O) = −0.9, which is far above the solar value of −1.57, though 
still within the range of stars within the solar neighbourhood™. For all 
other elements, we derived upper limits (see Extended Data Table 2). 

Radiative levitation is negligible for oxygen and sulfur at the effective 
temperature of WD J0914+1914 (see figure 2 of ref. *), and therefore 
the large photospheric abundances of these elements imply ongoing 
accretion. Accounting for the diffusion velocities, the photospheric 
sulfur and oxygen abundances require accretion rates of Ṁ_S = 5.5 × 10⁸ g s⁻¹ 
and Ṁ_O = 2.7 × 10⁹ g s⁻¹, respectively. Several studies argue that the 
gradient of the mean molecular weight resulting from the accretion of 
metals into the radiative hydrogen atmospheres of warm white dwarfs 
drives thermohaline mixing”*>°°, which would cause the above rates 
to be underestimated. The most recent studies”°° extend only to 
Teff ≈ 20,000 K, and we conclude that the combined accretion rate of oxygen 
and sulfur based on purely diffusive sedimentation provides a lower 
limit of Ṁ = 3.3 × 10⁹ g s⁻¹. The actual rate may be higher by an order of 
magnitude. 

The Hα emission line from the circumstellar disk suggests that 
hydrogen is also accreted onto the white dwarf. However, given that 
hydrogen is the dominant element in the atmosphere, we are not able 
to derive the hydrogen fraction in the accreted material. Consequently, 
the analysis of the photospheric spectrum does not provide a constraint 
on the contribution of hydrogen to the total accretion rate from the 
circumstellar disk. 


Dynamical information on the location of the emitting gas 
The double-peaked structure of the emission lines arises from the 
Keplerian motion (v_K) of gas in a disk around the white dwarf, with 


v_K = \sqrt{G M_{wd} / r}    (1) 


where G is the gravitational constant, M_wd the mass of the white dwarf, 
and r the distance from the centre of the white dwarf. Hence, the 
morphology of the emission line profiles 
provides dynamical information on the location of the emitting gas, 
with the separation of the double peaks corresponding to emission 
from the outer edge of the disk, and the maximum velocity detected 
in the line wings corresponding to emission from the inner edge®**. 
Inspection of the normalized line profiles shows that the morphologies 
of the individual lines are distinctly different (see Extended Data 
Fig. 2). In particular, the double-peak separation of the forbidden 
[S II] lines is narrower than that of Hα and O I 8,446 Å, which implies 
that the region emitting [S II] extends to larger distances from the 
white dwarf. To estimate the velocity ranges over which the circumstellar 
gas contributes to the observed emission lines, we measured the 
separation of the double peaks and the maximum extent of the line 
wings (full width at zero intensity) of Hα, O I 8,446 Å and [S II] 4,068 Å 
(O I 7,774 Å is a relatively widely spaced triplet, which results in more 
complex sub-structure of the line profile, and the [O I] 6,300, 6,363 Å 
lines are affected by residuals from the oxygen night-sky airglow of the 
Earth’s atmosphere). Whereas the separation of the double peaks shows 
a wide range of velocities (about 150 km s⁻¹ for [S II] 4,068 Å, about 
260 km s⁻¹ for Hα and about 350 km s⁻¹ for O I 8,446 Å), all lines have 
similar maximum velocities, about 630–650 km s⁻¹, implying that they 
share a common inner radius in the disk. 
Because the inclination of our line of sight against the accretion disk 
is unknown, the semi-major axes of the Keplerian orbits associated with 
the velocities measured from the emission lines span a wide range. 
Adopting a white dwarf mass of M_wd = 0.56M⊙, the correspondence 
between inclination and semi-major axis is illustrated in Extended 
Data Fig. 3. Inclinations i < 5° can be excluded because the orbits of the 
gas would fall inside the white dwarf. For an inclination of 90° (edge-on), 
the inner and outer radii of the gas disk are about 1R⊙ and about 
10R⊙, respectively. 
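
The link between a measured line velocity and an orbital radius can be sketched numerically. The example below assumes a circular Keplerian orbit around a 0.56M⊙ white dwarf and a projected velocity v_obs = v_K sin i; it further assumes that half of the full width at zero intensity (about 320 km s⁻¹) is the projected velocity at the inner disk edge. The function names and physical constants are illustrative and not taken from the paper.

```python
import math

# Standard CGS reference values (not from the paper).
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.989e33    # solar mass, g
R_SUN = 6.957e10    # solar radius, cm


def keplerian_velocity_kms(r_rsun, m_wd_msun=0.56):
    """Circular Keplerian velocity (km/s) at radius r (solar radii)."""
    return math.sqrt(G * m_wd_msun * M_SUN / (r_rsun * R_SUN)) / 1e5


def radius_from_projected_velocity(v_obs_kms, inclination_deg, m_wd_msun=0.56):
    """Orbital radius (solar radii) for a projected velocity v_obs = v_K sin(i)."""
    v_k = v_obs_kms * 1e5 / math.sin(math.radians(inclination_deg))
    return G * m_wd_msun * M_SUN / v_k ** 2 / R_SUN


# Assumption: half of the ~640 km/s full width traces the inner disk edge.
print(f"v_K at 1 R_sun: {keplerian_velocity_kms(1.0):.0f} km/s")
print(f"inner radius, edge-on: {radius_from_projected_velocity(320.0, 90.0):.2f} R_sun")
print(f"inner radius, i = 30 deg: {radius_from_projected_velocity(320.0, 30.0):.2f} R_sun")
```

For a fixed observed velocity the inferred radius scales as sin²i, so lower inclinations place the same gas closer to the white dwarf.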


A photo-ionization model for the accretion disk 

Given the mixture of ionization species seen in emission (H I, O I, S I, S II), 
the temperature of the circumstellar gas disk is expected to be in the 
range of about 5,000–10,000 K. Whereas mass transfer through the disk 
will result in some viscous dissipation, the accretion rate inferred from 
the photospheric oxygen and sulfur abundances (Ṁ = 3.3 × 10⁹ g s⁻¹) 
cannot provide sufficient heating. This problem has been explored and 
discussed in detail for the known gaseous debris disks around white 
dwarfs*”. Instead, photo-ionization by the intense ultraviolet flux from 
the white dwarf is extremely efficient at heating the upper layers of the 
disk***°. We used the photo-ionization code Cloudy” to develop a simple 
model that can provide insight into the geometry and the composition 
of the circumstellar gas at WD J0914+1914. 

Cloudy requires the spectral energy distribution and luminosity of 
the ionizing source as inputs, for which we computed a white dwarf 
model spectrum spanning wavelengths from 10 Å to 3 μm with the 
parameters in Extended Data Table 1. We adopted the solar 
abundances for the base composition of the circumstellar gas as provided 
in solar_GASS10.abn® within the Cloudy distribution. 

The geometry of the irradiation of the disk by the white dwarf can 
broadly be separated into two regimes, depending on the ratio of the 
disk height to the radius of the white dwarf. The disk height is given by“ 


H = \sqrt{k T_{gas} r^3 / (\mu G M_{wd})}    (2) 


where k is the Boltzmann constant, T_gas the temperature of the gas and 
μ the mean molecular weight of the gas. The mean molecular weight 
depends on the abundances of the gas (primarily on the mass fractions 
of hydrogen, oxygen, and sulfur) and on the degree of ionization, but 
is not expected to vary much beyond μ ~ (10–30)m_p, where m_p is the 
proton mass. The dominant factor in the above expression is therefore 
the distance r from the white dwarf, which implies that the disk flares 
up ∝ r^{3/2}. 
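
To give a feel for the numbers, the disk height can be evaluated for representative parameters. The sketch below assumes the standard thin-disk form H = sqrt(k T_gas r³ / (μ G M_wd)), a gas temperature of 7,500 K and μ = 20 m_p (both illustrative picks from within the ranges quoted in the text), and a white dwarf mass of 0.56M⊙. The function name and constants are not from the paper.

```python
import math

# Standard CGS reference values (not from the paper).
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
K_B = 1.381e-16     # Boltzmann constant, erg K^-1
M_P = 1.673e-24     # proton mass, g
M_SUN = 1.989e33    # solar mass, g
R_SUN = 6.957e10    # solar radius, cm


def disk_height_rsun(r_rsun, t_gas=7500.0, mu_in_mp=20.0, m_wd_msun=0.56):
    """Disk height H (solar radii) at distance r (solar radii).

    t_gas and mu_in_mp are illustrative assumptions, not fitted values.
    """
    r = r_rsun * R_SUN
    h = math.sqrt(K_B * t_gas * r ** 3 / (mu_in_mp * M_P * G * m_wd_msun * M_SUN))
    return h / R_SUN


R_WD = 0.015  # white dwarf radius in solar radii, as quoted in the text
for r in (0.3, 1.0, 2.0, 4.0):
    h = disk_height_rsun(r)
    print(f"r = {r:3.1f} R_sun: H = {h:.4f} R_sun, H/R_wd = {h / R_WD:.2f}")
```

With these assumed values the disk height is a fraction of the white dwarf radius at 1R⊙ and overtakes it within a few solar radii, which is the behaviour that motivates treating the near and far regimes separately.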

Near the white dwarf, r < 1R⊙, the disk height is small compared to 
the radius of the white dwarf, and the disk is illuminated from above. 
However, owing to the shallow angle, α, of the incident radiation, the 
effective path length through the gas, H/sin α, is much larger than the 
actual disk height. For distances larger than about 1R⊙, the height of 
the disk approaches, and eventually exceeds, the radius of the white 
dwarf, and the assumption of a gas shell illuminated by a point source 
becomes appropriate. We approximated the near case by a gas shell 
at a distance r from the white dwarf with a thickness dr = H/sin α, and 
the far case by a gas shell at a distance r from the white dwarf with 
dr as a free parameter. 

We computed an initial set of Cloudy models, exploring the following 
free parameters: r, the distance from the centre of the white dwarf; dr, 
the extent of the gas layer; ρ, the density of the gas; and H/O, the number 
abundance of hydrogen relative to oxygen. In these initial models, we 
fixed log(S/O) = −0.9, as determined from the analysis of the white dwarf 
photospheric spectrum. No elements apart from hydrogen, oxygen 
and sulfur were included in the model at this stage. The primary input 
parameter for Cloudy is the hydrogen number density, N_H, which we 
computed for a given model from the gas density ρ, and the H/O and 
S/O abundance ratios. 


The ultraviolet radiation from the white dwarf photo-ionizes the 
upper layers of the circumstellar disk, heating it to about 
10,000–20,000 K. These layers are optically thin in the continuum, and the 
cooling of the gas takes place via the emission lines detected in the 
optical spectrum of WD J0914+1914. Deeper layers are essentially 
neutral, and the observed emission line spectrum does not provide a 
constraint on the total column density of this neutral material. Within 
reasonable limits, ρ and dr can be traded off against each other, as both 
parameters determine the total column density of the gas, and hence 
the total cross-section for intercepting the ultraviolet photons from 
the white dwarf. 

To assess the quality of the Cloudy models, we computed line flux 
ratios for all observed emission lines, and compared the values from 
the synthetic spectrum with those measured from the X-Shooter data: 


[Equation (3): the quality function, an average over the N_lines observed 
emission lines constructed from the flux ratios F^o/F^s and F^s/F^o; its 
exact form is not recoverable from the extracted text.] 

where F^o and F^s refer to the observed and synthetic line fluxes, 
respectively. The above function equally penalizes models in which the line 
fluxes are either too large, or too low. 
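
One way to build a figure of merit with this symmetric behaviour is sketched below. This is an assumption for illustration, not necessarily the function used in the paper: for every line it takes the larger of the two flux ratios F^o/F^s and F^s/F^o and averages over all lines, so that a perfect match scores 1 and a model that is a factor of two too bright scores the same as one that is a factor of two too faint. The function name and the example fluxes are hypothetical.

```python
def line_ratio_quality(observed, synthetic):
    """Mean over lines of max(F_o / F_s, F_s / F_o).

    Assumed form for illustration only. Both inputs are sequences of
    positive line fluxes in matching order; the result is 1.0 for a
    perfect match and grows as the model departs from the data.
    """
    if len(observed) != len(synthetic) or len(observed) == 0:
        raise ValueError("need two non-empty flux lists of equal length")
    total = 0.0
    for f_obs, f_syn in zip(observed, synthetic):
        total += max(f_obs / f_syn, f_syn / f_obs)
    return total / len(observed)


# Hypothetical fluxes in arbitrary units, illustrative only.
observed = [1.0, 0.4, 2.5]
too_bright = [2.0, 0.8, 5.0]
too_faint = [0.5, 0.2, 1.25]
print(line_ratio_quality(observed, too_bright))
print(line_ratio_quality(observed, too_faint))
```

Both calls return the same value, which is the property the text describes.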

From the first exploratory models we found that for a solar O/H 
ratio, the Balmer lines were always much stronger than observed, 
independent of the exact choice of r, dr, and ρ. Depleting hydrogen to 
log(O/H) = 0.29 resulted in model line flux ratios that were within the 
correct order of magnitude. At close separations from the white dwarf, low 
densities (ρ < 10™ g cm⁻³) are insufficient to cool the gas efficiently, and 
the resulting line flux ratios are incompatible with the observations. 
For higher densities, cooling becomes more efficient, and the deeper 
layers are sufficiently cool to produce substantial emission in the O I 
lines. However, the synthetic spectra contain a number of strong lines 
that are not observed (O I 3,946 Å, O II 4,650 Å and S I 4,590 Å), and fail 
to reproduce the line strengths of the observed forbidden lines ([O I], 
[S II]). In conclusion, this first sequence of models indicated that hydrogen 
is strongly depleted in the disk, and that geometries corresponding to 
very low inclinations (i < 20°; see Extended Data Fig. 3) that would result 
in inner disk radii with r ≪ 1R⊙ are incompatible with the observations. 

To find the parameter space that best reproduces the observed line 
flux ratios we proceeded to compute a grid of Cloudy models with a 
fixed dr = 0.3R⊙, sampling 0.1R⊙ ≤ r ≤ 10R⊙ (constrained by the widths of 
the observed lines; see Extended Data Fig. 3), 10’ g cm⁻³ ≤ ρ ≤ 10* g cm⁻³, 
log(O/H) = −0.11 to 0.89 and log(S/O) = −1.77 to 0.23. The quality of 
the models in the (r, ρ) plane (Extended Data Fig. 4) illustrates that the 
best match to the observed line flux ratios is achieved for a location of 
the gas at 1R⊙ ≤ r ≤ 4R⊙, and a density of ρ ~ 10"* g cm⁻³. The synthetic 
spectra in this parameter range produce line flux ratios that are 
typically consistent with the observed values within a factor of around 2, 
and do not result in emission lines that are not detected. Combining 
the constraints from the Cloudy models with those derived from the 
profile morphology of the observed emission lines (Extended Data 
Figs. 2, 3) suggests an inclination of the disk i ≳ 50°. 


Abundances of the circumstellar disk 

The best Cloudy models are found for log(O/H) = 0.29 and 
log(S/O) = −0.5, with uncertainties of 0.3 dex. For comparison, we 
derived log(S/O) = −0.9 from the photospheric analysis. Both 
measurements agree within a factor of around 2.5. This is the first instance where 
the composition of the accreted material is consistently determined 
by two independent measurements, that is, from the absorption lines 
within the white dwarf atmosphere, and from the emission lines of the 
circumstellar gas reservoir. 

The fact that the X-Shooter spectrum contains only emission lines of 
hydrogen, oxygen and sulfur provides upper limits on the abundances 
of other elements within the circumstellar gas disk that are typically 
found in white dwarfs accreting planetary debris. Fixing r = 2.5R⊙ and 
ρ = 10"’ g cm⁻³, we proceeded to add additional elements into the disk 
model, with their initial abundances set to their solar values. The resulting 
Cloudy spectra predict strong emission lines for C, N, Na, Mg, Al, Si, K, 
Ca and Fe. We re-computed Cloudy models, reducing the abundances 
until the line strengths in the models were consistent with the 
non-detection in the X-Shooter spectrum. The upper limits on the 
abundances of these elements within the circumstellar gas disk are reported 
in Extended Data Table 2. Figure 3 illustrates that these upper limits 
are much more stringent for Na, Si, Fe and Ca compared to the limits 
obtained from the white dwarf photosphere analysis. 


Emission line profiles from a Keplerian disk 

The Cloudy model only takes into account the integrated line fluxes. In 
order to explore how well this model can also reproduce the observed 
emission line profiles we convolved the Cloudy spectrum from the 
computed grid that resulted in the best quality (see Eq. (3)), 
corresponding to r = 1.89R⊙, dr = 0.3R⊙, ρ = 10"* g cm⁻³, log(S/O) = −0.5, and 
log(H/O) = −0.29, with the line profiles of a Keplerian disk. As, at this 
stage, we are interested in the shape of the line profiles, we normalized 
the line fluxes of the Cloudy model to those measured from the X-Shooter 
spectrum, effectively removing the small remaining differences (about 
a factor of two, see above) in the absolute line fluxes. We used analytical 
expressions for the Abel transform™, a power-law index of zero for the 
radial intensity distribution, and allowed the inner and outer radii of 
the disk to vary in order to match the observed emission line profiles. 
Adopting an inclination of the gaseous disk against the line of sight of 
i = 60°, the line widths and separations of the double peaks of Hα and 
O I 8,446 Å are well matched (Extended Data Fig. 2) by inner disk radii 
of r_in ≈ (1.0–1.3)R⊙ and outer radii of r_out ≈ (3.0–3.3)R⊙. The more 
complex structure of the O I 7,774 Å multiplet is also reasonably well 
reproduced by the same range of r_in and r_out. In contrast, [S II] 4,068 Å requires 
r_in ≈ (1.0–1.3)R⊙ and r_out ≈ (8–10)R⊙, and the two forbidden [O I] lines also 
imply similarly large outer radii, even if their double peaks are not well 
resolved due to the residuals of the sky background subtraction. The 
larger outer disk radii implied by the line profiles of the forbidden lines 
confirm the simple estimates we made above (see Extended Data Fig. 3). 

Whereas the synthetic line profiles of an axially symmetric disk 
reproduce the X-Shooter data relatively well, there is a noticeable 
difference in the shape of the central depression of O I 8,446 Å, with the 
observations showing a deeper V-shape compared to the U-shape of 
the model line profile. Similar differences have been observed in the 
Balmer lines from accretion disks in white dwarf binaries, and have 
been interpreted as optical depth effects”. We also note that matching 
the observed width of the Hα double peaks requires a small amount 
of additional intrinsic broadening, which could be the result of Stark 
broadening within the disk. 

We conclude that despite our model for the circumstellar gas disk 
being relatively simple (based on a constant density both in radius and 
vertical extent of the disk), the overall agreement in both the emerging 
fluxes and the profile morphology of the emission lines is remarkably 
good, resulting in a consistent set of parameters both in terms of the 
geometric location of the gas, and its composition. The reality will 
have a more complex geometry as well as density gradients. However, 
including that complexity in the model by introducing additional free 
parameters is unlikely to provide deeper physical insight. 


Ruling out a stellar/sub-stellar companion 

The initial classification of WD J0914+1914 suggested it to be a 
cataclysmic variable, that is, a short-period binary containing a white dwarf 
accreting from a Roche-lobe filling low-mass companion. Whereas the 
double-peaked morphology of the emission lines confirms the 
presence of a circumstellar gas disk, cataclysmic variables typically have 
much stronger Balmer (and often helium) lines ®, and no example 
of a cataclysmic variable with a white dwarf as hot as about 28,000 K 
dominating the optical spectrum is known®*. 


Another class of systems with a spectroscopic appearance similar to that of 
WD J0914+1914 are detached short-period post-common envelope 
binaries (PCEBs), that is, white dwarf binaries with low-mass 
companions, where Hα emission from the companion star is commonly 
detected”. In PCEBs containing hot white dwarfs, emission lines of 
calcium and iron originate from the intense irradiation of the 
companion’””?; these are not observed in WD J0914+1914. The emission lines 
in PCEBs are narrow and single-peaked, and trace the orbital motion 
of the companion star, with typical periods of hours and radial velocity 
amplitudes of several 100 km s⁻¹ (refs. “”). The double-peaked shape 
of the emission lines in WD J0914+1914 already rules out an origin from 
an irradiated low-mass companion. Moreover, their velocity variation 
is less than about 20 km s⁻¹, much lower than observed in any of the 
known PCEBs”. 

We measured the radial velocity of the white dwarf using ten of the 
strongest sulfur absorption lines in the X-Shooter UVB spectra. We 
fixed the relative wavelengths of these lines to their laboratory values, 
and their width to 1 Å, roughly matching the spectral resolving power, 
leaving only the depths of the lines, and the white dwarf radial velocity 
as free parameters. We find a mean white dwarf velocity of −47 km s⁻¹ 
and an average statistical uncertainty of the individual velocity 
measurements of about 4.0 km s⁻¹. In addition, there is a systematic 
uncertainty arising from imperfections in centring the star in the slit and the 
instrument model accounting for flexure. We measured this systematic 
uncertainty from the interstellar Ca K line to be about 3.7 km s⁻¹, and 
added it in quadrature to the statistical uncertainties, resulting in a 
total uncertainty of the individual radial velocity measurements of 
about 5.5 km s⁻¹. The radial velocities of WD J0914+1914 are consistent 
with a constant value, that is, the reduced χ² with respect to the mean 
is 0.95. We conclude that we do not detect a radial velocity 
variation of the white dwarf, with an upper limit on its radial velocity 
amplitude of K_wd < 3 km s⁻¹. For the typical periods of PCEBs, about 2 h to 
1 day”, brown dwarf companions are ruled out. In the period range for 
the mass donating object suggested by our analysis (see below), about 
8–10 days, companions with M ≳ 30M_Jup are ruled out. 
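
The error budget for the individual radial velocities can be reproduced directly. The sketch below adds the statistical and systematic terms in quadrature and shows how a reduced χ² against the mean would be evaluated; the list of ten velocities is hypothetical, since the individual measurements are not tabulated in this section, and the function names are illustrative.

```python
import math


def add_in_quadrature(*components):
    """Combine independent uncertainty terms as the root of the sum of squares."""
    return math.sqrt(sum(c * c for c in components))


def reduced_chi2_about_mean(values, sigma):
    """Reduced chi-squared of the values about their own mean.

    Uses a single per-measurement uncertainty sigma and N - 1 degrees
    of freedom (one is used up by the mean).
    """
    mean = sum(values) / len(values)
    chi2 = sum(((v - mean) / sigma) ** 2 for v in values)
    return chi2 / (len(values) - 1)


# Statistical (4.0 km/s) and systematic (3.7 km/s) terms quoted in the text.
sigma_total = add_in_quadrature(4.0, 3.7)
print(f"total uncertainty: {sigma_total:.2f} km/s")

# Hypothetical velocities in km/s, for illustration only.
velocities = [-52.0, -44.0, -49.0, -41.0, -50.0, -46.0, -43.0, -53.0, -45.0, -47.0]
print(f"reduced chi2: {reduced_chi2_about_mean(velocities, sigma_total):.2f}")
```

The quadrature sum of the two quoted terms comes out close to the 5.5 km s⁻¹ given in the text; a reduced χ² near 1 then indicates scatter consistent with the measurement errors alone.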

Furthermore, a stellar companion would result in an infrared excess 
with respect to an isolated white dwarf. The location of WD J0914+1914 
has been covered by the UKIRT Hemisphere Survey” in the J-band. 
WD J0914+1914 is not detected at the J = 19.6 (5σ) magnitude limit of the 
UKIRT Hemisphere Survey. The white dwarf alone has J = 19.65, 
computed from the synthetic spectrum. Using absolute J-band magnitudes 
of M-dwarfs and L-type brown dwarfs”, and a conservative upper limit 
on the distance of d = 631 pc*, the non-detection of WD J0914+1914 in 
the UKIRT Hemisphere Survey excludes the presence of a companion 
earlier than an L5 brown dwarf. 

The forbidden oxygen and sulfur lines detected in the spectrum of 
WD J0914+1914 have not been observed in any accreting or detached 
white dwarf binary. Accretion from the wind of a low-mass companion 
does result in photospheric metal contamination in these binaries”; 
however, their abundances derived from spectroscopic analysis are 
broadly consistent with solar abundances of the accreted material, with 
strong absorption lines of calcium, iron, magnesium and silicon"***°. 

We conclude from the analysis of the observations that 
WD J0914+1914 is a white dwarf accreting from a circumstellar gas disk 
with extremely non-solar abundances, and that the origin of the cir- 
cumstellar disk is not a stellar or brown-dwarf companion. 


Photo-evaporation versus Roche-lobe overflow 

The disk size provides constraints on the location of the planet. We 
assume that the outer radius of the disk is traced by the forbidden [S II] and 
[O I] lines, r_out ≈ 10R⊙, and that this radius corresponds to the maximum 
size of the accretion disk allowed by tidal forces, which is 
approximately 90% of the white dwarf’s Roche lobe, that is, r_out = 0.9R_L,wd”. 
For a given mass of the planet, assuming a circular orbit, and using 
the standard formula for the Roche-lobe radius®, this expression allows 
an estimate of the semi-major axis of the planet’s orbit. For Neptune- 
to Jupiter-mass giant planets this implies semi-major axes of about 
(14–16)R⊙ and orbital periods of 8–10 days. We envisage two scenarios 
in which WD J0914+1914 could accrete from a planet on a close orbit, 
that is, either via mass loss driven by the intense EUV luminosity of 
the white dwarf, or via Roche-lobe overflow. Both alternatives are 
discussed in detail below. 
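
The step from the outer disk radius to the orbit of the planet can be sketched as follows. The text does not spell out the Roche-lobe formula, so this example assumes Eggleton's (1983) widely used approximation; the physical constants, function names and the choice of a Jupiter-mass planet are likewise illustrative rather than taken from the paper.

```python
import math

# Standard CGS reference values (not from the paper).
G = 6.674e-8         # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.989e33     # solar mass, g
R_SUN = 6.957e10     # solar radius, cm
M_JUPITER = 1.898e30 # Jupiter mass, g


def roche_lobe_fraction(q):
    """Eggleton (1983) approximation for R_L / a.

    q is the mass of the object whose lobe is wanted divided by the
    mass of its companion (here: white dwarf mass / planet mass).
    """
    q13 = q ** (1.0 / 3.0)
    q23 = q13 * q13
    return 0.49 * q23 / (0.6 * q23 + math.log(1.0 + q13))


def planet_orbit(r_out_rsun, m_planet_g, m_wd_msun=0.56, fill=0.9):
    """Semi-major axis (solar radii) and period (days), assuming r_out = fill * R_L,wd."""
    m_wd = m_wd_msun * M_SUN
    a_rsun = r_out_rsun / (fill * roche_lobe_fraction(m_wd / m_planet_g))
    a_cm = a_rsun * R_SUN
    period_s = 2.0 * math.pi * math.sqrt(a_cm ** 3 / (G * (m_wd + m_planet_g)))
    return a_rsun, period_s / 86400.0


a, period = planet_orbit(10.0, M_JUPITER)
print(f"a = {a:.1f} R_sun, P = {period:.1f} d")
```

For a Jupiter-mass planet and an outer disk radius of 10R⊙ this sketch lands near the lower end of the ranges of semi-major axis and orbital period quoted above.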


Photo-evaporation 

EUV radiation is known to drive atmospheric mass loss from giant 
planets in close orbits around their host stars. This hydrodynamic escape 
is the result of the ionization of hydrogen. In the absence of efficient 
cooling mechanisms, no hydrostatic solution exists for the atmosphere 
of a planet subject to intense irradiation, leading to the formation of a 
trans-sonic flow’. Drag forces in this outflow cause heavier elements to 
be carried with the escaping hydrogen. The detection of Lyα absorption 
from atomic hydrogen located outside the Roche lobe of the transiting 
planet HD 209458b clearly demonstrated the escape of atmospheric 
material from the planet, and provided the first direct evidence for the 
evaporation of exo-planets®. Subsequent Lyα transits were detected 
in a number of other systems, including the hot Jupiter HD 189733b** 
and the close-in Neptune-mass planet GJ 436b7>*°. 

In addition to these Lyα transit observations, heavier elements in the 
extended atmospheres of transiting planets were detected in 
ultraviolet and X-ray transit spectroscopy” *’, showing that the atmospheric 
escape must be driven by a hydrodynamic process. Apart from the 
observational detection in a number of individual systems, 
hydrodynamic escape is thought to play a crucial role in shaping the 
properties of the population of close-in exoplanets®, resulting in the nearly 
complete absence of Neptune-mass planets with orbital periods of a 
few days (the warm Neptune desert) as well as the dearth of low-mass 
planets with 1.5-2 Earth radii (the evaporation valley). 

To test the plausibility of hydrodynamic escape for the planet around
WD J0914+1914 we determined the EUV flux of the white dwarf and estimated
the corresponding evaporation rates using scaling laws derived
from detailed hydrodynamic models. The incident EUV flux at the
position of the planet was obtained by integrating white dwarf model
spectra from 10 Å to 912 Å.
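The integration step can be illustrated numerically. The sketch below is not the authors' pipeline: it replaces the white dwarf model spectra with a blackbody (an assumption that ignores the Lyman edge and any line blanketing, so it is likely to overestimate the flux), derives the white dwarf radius from the mass and surface gravity listed in Extended Data Table 1, and dilutes the surface flux geometrically. All function names are hypothetical.

```python
import math

# Physical constants in cgs units.
H = 6.62607015e-27   # Planck constant [erg s]
C = 2.99792458e10    # speed of light [cm/s]
K_B = 1.380649e-16   # Boltzmann constant [erg/K]
G = 6.674e-8         # gravitational constant [cm^3 g^-1 s^-2]
M_SUN = 1.989e33     # solar mass [g]
R_SUN = 6.957e10     # solar radius [cm]
ANGSTROM = 1.0e-8    # [cm]


def planck_lambda(wavelength_cm, temperature):
    """Blackbody intensity B_lambda [erg s^-1 cm^-2 cm^-1 sr^-1]."""
    x = H * C / (wavelength_cm * K_B * temperature)
    if x > 700.0:  # exp(x) would overflow; the intensity is negligible here
        return 0.0
    return 2.0 * H * C**2 / wavelength_cm**5 / math.expm1(x)


def surface_euv_flux(temperature, lam_min=10.0, lam_max=912.0, steps=20000):
    """Trapezoidal integral of pi * B_lambda between lam_min and lam_max (Angstrom)."""
    a, b = lam_min * ANGSTROM, lam_max * ANGSTROM
    h = (b - a) / steps
    total = 0.5 * (planck_lambda(a, temperature) + planck_lambda(b, temperature))
    for i in range(1, steps):
        total += planck_lambda(a + i * h, temperature)
    return math.pi * total * h


def white_dwarf_radius(mass_msun, log_g):
    """Radius [cm] from the definition of surface gravity, g = G M / R^2."""
    return math.sqrt(G * mass_msun * M_SUN / 10.0**log_g)


def incident_euv_flux(temperature, mass_msun, log_g, separation_rsun):
    """EUV flux [erg cm^-2 s^-1] at the given orbital separation (solar radii)."""
    r_wd = white_dwarf_radius(mass_msun, log_g)
    dilution = (r_wd / (separation_rsun * R_SUN)) ** 2
    return surface_euv_flux(temperature) * dilution


# Parameters from Extended Data Table 1; the separations are illustrative.
flux_15 = incident_euv_flux(27743.0, 0.56, 7.85, 15.0)
flux_30 = incident_euv_flux(27743.0, 0.56, 7.85, 30.0)
print(f"blackbody EUV flux at 15 R_sun: {flux_15:.3g} erg cm^-2 s^-1")
```

Because the dilution factor is purely geometric, doubling the separation reduces the incident flux by a factor of four, whichever spectrum is integrated.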

Trace metals in the photosphere of WD J0914+1914, resulting from the
accretion of planetary material, may block some of the EUV emission.
To evaluate the importance of EUV line blanketing, we computed three
white dwarf models, fixing the effective temperature and surface gravity
to the values determined from the photospheric analysis (Extended Data
Table 1): (1) a pure-hydrogen model, (2) a hydrogen model with oxygen
and sulfur at the photospheric abundances (Extended Data Table 2),
and (3) a hydrogen model that additionally includes C, N, Na, Mg, Al, Si, P,
Cl, Ar, K, Ca, Ti, V, Mn and Fe, using the lower of the two upper limits on their
abundances (photospheric or disk, Extended Data Table 2) and solar
abundances for those elements without meaningful upper limits. We find
very small variations of the EUV flux (less than 10%) between the three
models; that is, the amount of metal pollution is insufficient to cause
much line blanketing. Below, we use model (2), including photospheric
sulfur and oxygen. The EUV flux incident upon the planet is shown as a
function of orbital separation in the upper panel of Extended Data Fig. 5.

The EUV luminosity of WD J0914+1914 is comparable to that of T Tauri
stars, which are thought to efficiently evaporate the atmospheres of
their young giant planets. In particular, Neptune-mass planets
at separations below 0.1 astronomical units (roughly the outer border
of the warm Neptune desert) are expected to lose large parts of their
atmospheres during these early stages. Analogously, the large EUV
luminosity of WD J0914+1914 implies that hydrodynamic escape is
unavoidable for any planet with a hydrogen-rich atmosphere and a
semi-major axis less than 200 R⊙.

For large EUV fluxes, hydrodynamic escape can be in the energy-limited
or the recombination-limited regime. Hydrodynamic mass
loss scales linearly with the EUV irradiation in the energy-limited
regime, and with the square root of the EUV irradiation in the
recombination-limited regime.
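The two scalings can be written as a single piecewise function. The sketch below is illustrative only: it assumes that the energy-limited branch applies below a transition flux and the recombination-limited branch above it, and it forces the two branches to meet at the transition (a simplification introduced here). The function name and the normalization are hypothetical.

```python
import math


def relative_mass_loss(flux, flux_transition=1.0e4, rate_at_transition=1.0):
    """Mass loss rate relative to its value at the transition flux.

    flux and flux_transition are EUV fluxes in erg cm^-2 s^-1.
    """
    ratio = flux / flux_transition
    if ratio <= 1.0:
        # Energy-limited regime: rate proportional to the flux.
        return rate_at_transition * ratio
    # Recombination-limited regime: rate proportional to sqrt(flux).
    return rate_at_transition * math.sqrt(ratio)


# Doubling the flux doubles the rate in the energy-limited regime,
# whereas the flux must be quadrupled to double the rate at high flux.
low = relative_mass_loss(2.0e3) / relative_mass_loss(1.0e3)
high = relative_mass_loss(4.0e6) / relative_mass_loss(1.0e6)
print(low, high)
```

In the high-flux branch the rate is far less sensitive to uncertainties in the EUV flux, which is one reason an order-of-magnitude estimate is still informative.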

For a Jupiter-mass planet, the transition between both regimes is usually
assumed to occur at 10,000 erg cm⁻² s⁻¹ but can vary depending on
the mass and the radius of the planet across a wide range of EUV fluxes,
about 1,000-100,000 erg cm⁻² s⁻¹. Given that we currently do not know
the mass and radius of the planet at WD J0914+1914, we assume 10,000
erg cm⁻² s⁻¹ for the transition. Consequently, the mass loss rates we
calculate below should be considered as an order-of-magnitude estimate.
For the mass loss rate in the energy-limited regime of irradiated
giant planets we used:
where R_p and M_p are the radius and the mass of the planet, F_EUV is the
incident EUV flux, and ε is the efficiency of using the incident energy,
which we set to ε = 0.3. At close orbital separations, where
the Roche lobe (R_RL) and planet radius become comparable, mass loss
is enhanced. This is accounted for by the correction term K(ξ = R_RL/R_p),
for which we used equation (17) of the corresponding reference. The mass loss rate driven
by the strong EUV irradiation from WD J0914+1914 is shown in the bottom
panel of Extended Data Fig. 5. The K-term is responsible for the
steep increase of Ṁ towards the smallest separations. For the estimated
location of the planet (around (14-16) R⊙, grey-shaded region) we obtain
a mass loss rate of around 5 x 10" g s⁻¹, depending only weakly on the
planet mass. At that distance from the planet, the outflow velocities
that are required to reach the Roche lobe of the planet are far smaller
than the velocity required to escape the gravitational potential of the
white dwarf, and consequently the evaporated material will fall towards
the white dwarf.
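For orientation, the quantities defined above can be combined in a commonly quoted form of the energy-limited formula, Ṁ = ε π F_EUV R_p³ / (G M_p K(ξ)), with K(ξ) = 1 − 3/(2ξ) + 1/(2ξ³). The sketch below assumes this form; it may differ in detail from the expression the authors used. The small-mass-ratio (Hill-type) Roche-lobe radius and the flux value in the usage lines are further assumptions made here for illustration.

```python
import math

G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
M_SUN = 1.989e33   # solar mass [g]
R_SUN = 6.957e10   # solar radius [cm]
M_JUP = 1.898e30   # Jupiter mass [g]
R_JUP = 7.1492e9   # Jupiter radius [cm]


def roche_correction(xi):
    """K(xi) = 1 - 3/(2 xi) + 1/(2 xi^3), for xi = R_RL / R_p > 1."""
    return 1.0 - 1.5 / xi + 0.5 / xi**3


def hill_roche_radius(separation, m_planet, m_star):
    """Roche-lobe radius in the small-mass-ratio limit, a (M_p / 3 M_*)^(1/3)."""
    return separation * (m_planet / (3.0 * m_star)) ** (1.0 / 3.0)


def energy_limited_mass_loss(f_euv, r_planet, m_planet, xi, efficiency=0.3):
    """Mass loss rate [g/s] = eps * pi * F_EUV * R_p^3 / (G * M_p * K(xi))."""
    return (efficiency * math.pi * f_euv * r_planet**3
            / (G * m_planet * roche_correction(xi)))


# Jupiter-like planet at 15 R_sun from a 0.56 M_sun white dwarf.
r_rl = hill_roche_radius(15.0 * R_SUN, M_JUP, 0.56 * M_SUN)
xi = r_rl / R_JUP
f_euv = 1.0e4  # placeholder flux [erg cm^-2 s^-1], not a value from the paper
rate = energy_limited_mass_loss(f_euv, R_JUP, M_JUP, xi)
print(f"xi = {xi:.1f}, K = {roche_correction(xi):.3f}, rate = {rate:.3g} g/s")
```

K(ξ) tends to 1 for a planet deep inside its Roche lobe and to 0 as ξ approaches 1, which reproduces the steep rise of the mass loss rate towards small separations described in the text.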

Hydrogen is probably the dominant species in the planet's atmosphere,
driving the hydrodynamic escape, and is hence also expected
to be the most abundant element in the circumstellar disk at
WD J0914+1914. However, the weakness of Hα in the X-Shooter spectrum,
compared to the emission lines of oxygen and sulfur, already suggests
a substantial depletion of hydrogen in the disk with respect to solar
abundances, which we quantitatively confirmed with the Cloudy
photo-ionization models (log(O/H) = 0.29, compared to the solar ratio of
-3.31).

Motivated by these results, we explored the effect of forces other
than gravity upon the material escaping the atmosphere of the planet.
Being a hot white dwarf, WD J0914+1914 is not only bright in the EUV
but also in the far-ultraviolet region of the spectrum, where radiation
pressure from Lyα photons can transfer momentum to the evaporated
hydrogen. This effect is well studied in the Solar System, where the
radiation pressure acting upon neutral hydrogen atoms within the
heliosphere is proportional to the total flux in the solar Lyα emission
line. The relative importance of radiation pressure is usually expressed
as μ, the ratio between the force related to radiation pressure and gravity,
which is very accurately known for the Sun. The effect of Lyα radiation
pressure on the motion of interplanetary neutral hydrogen has
been measured and, depending on the solar cycle, μ varies between
about 0.8 during the minimum of solar activity and about 1.8 during
the maximum. Species heavier than hydrogen are much less affected
by radiation pressure.

To establish the importance of radiation pressure on the material
accreting onto WD J0914+1914 we compared the Lyα flux of the white
dwarf with that of the Sun. For that purpose, we retrieved the
high-resolution far-ultraviolet spectra of the Sun obtained with the SORCE
SOLSTICE instrument. Despite the fact that the white dwarf spectrum
shows Lyα in absorption, whereas it is in emission in the spectrum of
the Sun, the flux in the very core of Lyα of WD J0914+1914 is comparable
to that of the Sun (Extended Data Fig. 6), a simple consequence of
the high effective temperature of the white dwarf. Moreover, the flux
in WD J0914+1914 rapidly increases outside the core of Lyα, whereas
it drops in the Sun, and as M_wd < M⊙, μ > 1. We conclude that this
provides a natural explanation for the low abundance of hydrogen in the
circumstellar disk at WD J0914+1914: the strong radiation pressure
from the Lyα photons of WD J0914+1914 efficiently inhibits the inflow
of hydrogen. This radiation-pressure-driven hydrogen depletion of
the material flowing towards the white dwarf results in an accretion
rate onto WD J0914+1914 that is much smaller than the estimated mass
loss rate (about 5 x 10" g s⁻¹; see Extended Data Fig. 5). Given that the
mass loss rate is an order-of-magnitude estimate only, we conclude
that hydrodynamic escape, and subsequent accretion of the heavier
elements that are dragged by the escaping hydrogen, provides a
consistent explanation of our observations.
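The scaling behind this argument can be made explicit. Radiation pressure and gravity both fall off with the square of the distance, so their ratio depends only on the Lyα flux (compared at equal distance) and on the stellar mass. The sketch below rescales the solar values quoted above; treating "comparable" line-core fluxes as a flux ratio of exactly 1 is an assumption made here, and the function name is hypothetical.

```python
def radiation_pressure_ratio(ratio_sun, flux_ratio, mass_msun):
    """Radiation-pressure-to-gravity ratio for a star, scaled from the Sun.

    ratio_sun  : the solar value of the ratio (about 0.8 to 1.8 over a cycle)
    flux_ratio : Ly-alpha flux of the star divided by that of the Sun,
                 both evaluated at the same distance
    mass_msun  : stellar mass in solar masses
    """
    return ratio_sun * flux_ratio / mass_msun


# White dwarf mass from Extended Data Table 1; flux_ratio = 1 is an assumption.
at_solar_minimum = radiation_pressure_ratio(0.8, 1.0, 0.56)
at_solar_maximum = radiation_pressure_ratio(1.8, 1.0, 0.56)
print(round(at_solar_minimum, 2), round(at_solar_maximum, 2))
```

Even for equal line-core fluxes the lower mass of the white dwarf pushes the ratio above unity, and the larger flux in the line wings would raise it further for hydrogen that is Doppler-shifted out of the core.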


Roche-lobe overflow 

An alternative possibility for accretion from a giant planet onto
WD J0914+1914 is Roche-lobe overflow, which can be substantially increased
by tidal heating. However, the scenario of Roche-lobe overflow
appears to be extremely unlikely for WD J0914+1914 for several reasons.
Crucially, the observed emission lines are best reproduced by a
circumstellar disk extending up to about 10 R⊙, which implies that the
planet must be located at >10 R⊙ from WD J0914+1914. Even a Jupiter
mass planet would have to be substantially inflated (to about 8 Jupiter
radii) to fill its Roche lobe at such a large orbital separation. More generally,
using an empirical mass-radius relation for giant planets, we find
that Roche-lobe overflow should occur at separations of (1-2) R⊙, clearly
incompatible with the derived disk size. Furthermore, the mass transfer
rates expected from a Roche-lobe overflow configuration exceed the
value we derived from the photospheric analysis by several orders of
magnitude. We conclude that Roche-lobe overflow is incompatible
with the observational characteristics of WD J0914+1914.
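The size of the Roche lobe at these separations can be checked with a short calculation. The sketch below uses the widely used Eggleton approximation for the Roche-lobe radius, which is a choice made here for illustration and not necessarily the prescription adopted by the authors; the function names are hypothetical.

```python
import math

R_SUN = 6.957e10           # solar radius [cm]
R_JUP = 7.1492e9           # Jupiter radius [cm]
M_JUP_IN_MSUN = 9.546e-4   # Jupiter mass in solar masses


def eggleton_roche_fraction(q):
    """Roche-lobe radius over orbital separation for mass ratio q = M_p / M_wd."""
    q13 = q ** (1.0 / 3.0)
    q23 = q13 * q13
    return 0.49 * q23 / (0.6 * q23 + math.log(1.0 + q13))


def roche_radius_rjup(separation_rsun, m_planet_mjup, m_wd_msun=0.56):
    """Roche-lobe radius of the planet in Jupiter radii."""
    q = m_planet_mjup * M_JUP_IN_MSUN / m_wd_msun
    return eggleton_roche_fraction(q) * separation_rsun * R_SUN / R_JUP


for separation in (10.0, 15.0):
    print(separation, round(roche_radius_rjup(separation, 1.0), 1))
```

For a Jupiter-mass planet this gives a Roche lobe of several Jupiter radii at 10 R⊙ and roughly 8 Jupiter radii near 15 R⊙, so a planet of ordinary size sits well inside its Roche lobe at the inferred orbit.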


Common envelope evolution versus planet-planet scattering 

Whereas the observational evidence for a giant planet in a close-in 
orbit around WDJ0914+1914 is compelling, it is clear that a planet with 
an initial semi-major axis of a few tens of solar radii would not have 
survived the red giant branch evolution of the white dwarf progeni- 
tor. The physical mechanism that migrated the planet from several 
astronomical units onto its current orbit is open to some speculation. 

One possibility is common envelope evolution. At the onset of a 
common envelope, dynamically unstable mass transfer starts from 
the giant star onto the secondary object, in our case the planet. The 
timescale for this unstable mass transfer quickly becomes shorter 
than the thermal timescale of the planet and, as a result, a common
envelope forms around the planet and the core of the giant star, the 
future white dwarf. This common envelope is expelled at the expense 
of orbital energy, that is, the planet spirals inward. 

Common envelope evolution is known to produce binaries containing
white dwarfs with stellar and sub-stellar companions and
orbital periods in the range of hours to days. In fact, common envelope
evolution involving planetary mass objects has been suggested as a
possible scenario for the formation of low-mass white dwarfs without
a detectable stellar companion. As the mass of the planet, M_p, is much
smaller than the white dwarf mass, the final separation after common
envelope evolution can be written as:
where R_g is the radius of the giant star at the beginning of the inspiral
phase, α_CE is the common envelope efficiency, λ is the binding energy
parameter, and M is the mass of the giant star, which can be separated
into the core mass (mass of the future white dwarf, M_core) and the envelope
mass (M_env). The latter is going to be expelled during the process.
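The standard energy formalism gives a compact estimate in this limit. Equating the envelope binding energy, G M M_env / (λ R), with α_CE times the orbital energy released, and neglecting the initial orbital energy of the low-mass companion, yields a final separation of (α_CE λ / 2) R M_core M_p / (M M_env). The sketch below assumes this textbook form, which may differ in detail from the authors' equation; the giant radius and masses in the usage line are hypothetical.

```python
def ce_final_separation(r_giant, m_giant, m_core, m_planet, alpha_ce, lam):
    """Final separation after common envelope evolution, for m_planet << m_giant.

    The result is in the same units as r_giant; masses share any common unit.
    """
    m_env = m_giant - m_core
    return 0.5 * alpha_ce * lam * r_giant * m_core * m_planet / (m_giant * m_env)


# Hypothetical giant: 1.0 M_sun with a 0.56 M_sun core and a radius of 200 R_sun,
# a Jupiter-mass companion (9.546e-4 M_sun), alpha_CE = 1 and lambda = 1.
a_final = ce_final_separation(200.0, 1.0, 0.56, 9.546e-4, 1.0, 1.0)
print(f"final separation: {a_final:.3f} R_sun")
```

The final separation is linear in the planet mass and in the product α_CE λ, so wide final orbits require either massive planets or a loosely bound envelope (large λ).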


During common envelope evolution, the planet will move inside the 
envelope of the giant star and is likely to be completely evaporated. 
Whether this happens, and at what separation, depends on the temperature
structure of the giant star envelope, which can be approximated by

T= 1.78 x 10° x (r/R) °K (6) 
The radius at which evaporation of the planet occurs can then be estimated
by equating the local sound speed in the envelope and the escape
velocity of the planet.
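The evaporation criterion translates into a critical envelope temperature. The sketch below equates an adiabatic sound speed with the surface escape velocity of the planet; the mean molecular weight (0.6) and adiabatic index (5/3) are assumptions chosen here for an ionized hydrogen-rich envelope, and the function names are hypothetical.

```python
import math

G = 6.674e-8         # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.380649e-16   # Boltzmann constant [erg/K]
M_H = 1.6726e-24     # hydrogen mass [g]
M_JUP = 1.898e30     # Jupiter mass [g]
R_JUP = 7.1492e9     # Jupiter radius [cm]


def escape_velocity(mass, radius):
    """Surface escape velocity [cm/s]."""
    return math.sqrt(2.0 * G * mass / radius)


def sound_speed(temperature, mu=0.6, gamma=5.0 / 3.0):
    """Adiabatic sound speed [cm/s] of an ideal gas."""
    return math.sqrt(gamma * K_B * temperature / (mu * M_H))


def evaporation_temperature(mass, radius, mu=0.6, gamma=5.0 / 3.0):
    """Temperature [K] at which the sound speed equals the escape velocity."""
    return 2.0 * G * mass * mu * M_H / (gamma * K_B * radius)


t_crit = evaporation_temperature(M_JUP, R_JUP)
print(f"critical temperature for a Jupiter: {t_crit:.3g} K")
```

Combined with a temperature profile of the giant envelope, this critical temperature fixes the radius inside which a planet of given mass and radius would be evaporated.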

The above approach has previously been used to estimate the outcome
of common envelope evolution involving planetary mass companions
using constant values for α_CE and λ. Throughout the
last two decades, however, new constraints on the common envelope
efficiency α_CE have been obtained and algorithms have been developed
that calculate the binding energy parameter λ, which has been
found to depend sensitively on the mass and radius of the giant star, in
particular if recombination energy stored in the envelope is assumed
to contribute to expelling the envelope. The contributions from
recombination energy are usually parameterized with a second efficiency
parameter, α_rec.

We calculated the possible outcome of common envelope evolution
involving a planetary mass companion taking into account these
recent developments. We used the BSE code to compute the evolution
of main sequence stars in the range (1-8) M⊙ and determined the
binding energy parameter for all core masses close to the mass of
WD J0914+1914, that is, we accepted all masses in the range (0.55-0.57) M⊙.
We then used Eq. (5) to determine the final separation for planet masses
ranging from super-Earths to the brown dwarf limit. For the planet to
survive, the final separation must be sufficiently large that the planet
does not evaporate in the red giant envelope, and does not fill its Roche
radius. The latter was calculated from the planet and white dwarf mass
and assuming an empirical mass-radius relation for giant planets.

The results derived from these calculations are illustrated in Extended
Data Fig. 7 for two different values of the common envelope parameters.
First, we used the strict upper limit for the contributions from orbital
energy and recombination energy, that is, we assumed that both energies
fully contribute to expelling the envelope (α_CE = α_rec = 1.0). These
calculations provide a stringent upper limit for the final separation
(shown as the dashed line in Extended Data Fig. 7). More realistic are
smaller values for both efficiencies; for example, observations of white
dwarf binaries with M-dwarf stellar companions can be reproduced if
α_CE = α_rec = 0.25 (ref. 1), for which the predicted final separations fall
below the solid black line in Extended Data Fig. 7.

The most important conclusion drawn from inspection of Extended
Data Fig. 7 is that planets with masses smaller than around 1 M_Jup cannot
survive common envelope evolution, whereas planets in the mass range
of about (1-13) M_Jup could end up with orbital separations consistent
with the estimated location of the planet around WD J0914+1914 at
(14-16) R⊙. In the latter case the initial planet-star separation must have
been about 1.5-5 astronomical units (depending on the planet mass) at
the onset of mass transfer from the giant star onto the planet, when the
giant star was close to the end of its AGB evolution, the binding energy
of the envelope was smallest and the contributions from recombination
energy were largest. Planet population synthesis models predict the
fraction of giant planets to increase with stellar mass, in agreement with
recent observational studies. Most of the white dwarfs in the Galaxy
descend from A/F-type stars, and hence their progenitors are likely to
have had rich planetary systems. Given that WD J0914+1914 is unique
among about 7,000 white dwarfs with similar cooling ages observed
by the SDSS, common envelope evolution can plausibly explain the
close-in orbit of the planet at WD J0914+1914, but requires it to be more
massive than Jupiter.

An alternative scenario explaining the existence of a giant planet
in a close-in orbit around WD J0914+1914 is planet-planet scattering.
Dynamical studies have shown that closely packed planetary systems
which remain stable and ordered on the main sequence can become
unpacked when the star evolves into a white dwarf. As a consequence
of this unpacking, inward incursions of planets can occur throughout
the entire white dwarf cooling track for basically all types of planetary
masses, ranging from Earth-like objects to giant planets. These inward
incursions of planets on highly eccentric orbits will generate strong tidal
forces that can circularize the planetary orbit. Planet-planet scattering
therefore represents an alternative explanation for the close planet
being evaporated by WD J0914+1914, and also works for planet masses
lower than the limit for common envelope evolution (about 1 M_Jup).

The large abundance of sulfur in the circumstellar disk at
WD J0914+1914 might indicate a planetary mass closer to those of
Neptune and Uranus because the fraction of heavier elements is
thought to increase with decreasing planetary mass, which would
point towards planet-planet scattering causing the inward migration
of the planet at WD J0914+1914. However, given the high mass loss rates
expected from hydrodynamic escape, we cannot exclude a more massive
planet. We therefore conclude that, given the currently available
observational constraints, both planet-planet scattering and
common envelope evolution are plausible explanations for the existence
of the planet in close orbit around WD J0914+1914.

Additional constraints on the composition of the accreted material 
from ultraviolet spectroscopy of WD J0914+1914, as well as including 
tidal effects into N-body simulations of the evolution of planetary sys- 
tems around white dwarfs, have the potential to distinguish between 
the two scenarios. 


The past and future of the planet around WDJ0914+1914 

As the evolution of white dwarfs is relatively well understood and primarily
consists of thermal heat loss through the non-degenerate envelope,
and the consequent contraction of this envelope, we can predict
the incident flux, and with it the evaporation rate of the planet at
WD J0914+1914, and the resulting accretion rate onto the white dwarf, as a
function of time. To that end we computed a small grid of white dwarf
model spectra covering effective temperatures ranging from 80,000 K
to 10,000 K, for the surface gravities corresponding to M_wd = 0.56 M⊙
at each temperature. Integrating the EUV fluxes of these model spectra,
we then used Eq. (4) to estimate the mass loss rate as a function of
effective temperature and cooling age (see Extended Data Fig. 8). As
expected, the mass loss rate decreases with time, particularly once the
incident flux on the planet drops below 10,000 erg cm⁻² s⁻¹, when mass
loss becomes directly proportional to the EUV flux. We estimate that
accretion of the evaporating material will become undetectable via
photospheric metal contamination once the white dwarf has cooled to
about 12,000 K, corresponding to a cooling age of around 350 million
years, when the mass loss rate drops below 10° g s⁻¹.

We estimate the total mass loss due to evaporation of the planetary 
atmosphere by integrating the mass loss rate over the cooling age of 
the white dwarf, and assuming that the planet reached its current orbit 
soon after the formation of the white dwarf. The resulting total mass 
loss is about 0.002 Jupiter masses, or about 0.04 Neptune masses. 
Thus, hydrodynamic escape will not change the structure of the giant 
planet around WDJ0914+1914. 
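The unit conversion in this estimate is easy to verify. The sketch below converts the quoted total mass loss into Neptune masses and, purely for scale, into a time-averaged rate over 350 million years; the averaging interval is an illustrative choice made here, not the integration actually performed, and the function names are hypothetical.

```python
M_JUP = 1.898e30           # Jupiter mass [g]
M_NEP = 1.024e29           # Neptune mass [g]
SECONDS_PER_YEAR = 3.156e7


def jupiter_to_neptune_masses(mass_mjup):
    """Convert a mass in Jupiter masses to Neptune masses."""
    return mass_mjup * M_JUP / M_NEP


def mean_rate(total_mass_mjup, duration_yr):
    """Time-averaged mass loss rate [g/s] for a given total and duration."""
    return total_mass_mjup * M_JUP / (duration_yr * SECONDS_PER_YEAR)


lost_in_neptunes = jupiter_to_neptune_masses(0.002)
average_rate = mean_rate(0.002, 350.0e6)
print(f"{lost_in_neptunes:.3f} Neptune masses, mean rate {average_rate:.2g} g/s")
```

The conversion gives just under 4% of a Neptune mass, consistent with both the rounded value in the text and the figure quoted in the caption of Extended Data Fig. 8.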


Data availability 


The SDSS and X-Shooter spectra analysed in this paper are available
from the SDSS (https://www.sdss.org/) and ESO (http://archive.eso.org)
archives.


Code availability 


Cloudy is publicly available (https://www.nublado.org/). The model 
atmosphere code of D. Koester is subject to restricted availability. 
Extended Data Fig. 1 | Identification spectrum of WD J0914+1914. The
unusual nature of WD J0914+1914 was identified from its optical spectrum
within SDSS Data Release 14. The Hα, O I 7,774 Å and O I 8,446 Å lines are clearly
detected; [S II] 4,068 Å and a blend of S I and O I near 9,240 Å are present near
the noise level.
Extended Data Fig. 2 | Emission lines from a Keplerian disk. The double-peaked
emission lines of hydrogen (a), oxygen (b, c, e, f) and sulfur (d) detected
in the optical spectrum of WD J0914+1914 originate in a gaseous circumstellar
disk. Shown in red are synthetic disk profiles computed by convolving the
Cloudy model that best matches the observed line flux ratios with the
broadening function of a Keplerian disk. Adopting an inclination of i = 60°,
the widths and double-peak separations of the Hα (a) and O I 8,446 Å (c) lines
are well reproduced for inner and outer disk radii of r_in = (1.0-1.3) R⊙ and
r_out ≈ (2.8-3.3) R⊙, respectively, consistent with the results from the Cloudy
models (see Extended Data Fig. 4). The emission of [S II] 4,068 Å (d) extends
from about 1 R⊙ to 10 R⊙. The V-shaped central depression of the O I 8,446 Å (c)
line suggests that the line is optically thick.
Extended Data Fig. 3 | Dynamical constraints on the location of the
circumstellar gas emitting the observed double-peaked emission lines. The
gas in the circumstellar disk follows Keplerian orbits, and hence the profile
shape of the observed emission lines (see Fig. 1 and Extended Data Fig. 2)
encodes the location of the gas. The velocity separation of the double peaks
and the maximum velocity in the line wings correspond to motion of gas at the
outer edge and inner edge of the disk, respectively. For a given inclination of
the disk, these velocities map into semi-major axes. A lower limit on the
inclination, i > 5°, arises from the finite size of the white dwarf (R_wd), and an
upper limit on the extent of the disk is provided for an edge-on, i = 90°,
inclination. The forbidden [S II] 4,068 Å line has a much smaller separation of
the double peaks compared to Hα and O I 8,446 Å, implying a larger radial
extent.
Extended Data Fig. 4 | Quality of the Cloudy fits. The line flux ratios of a grid
of Cloudy models spanning a range of gas densities, ρ, and radial distances
from the white dwarf, r, are compared to the observed
values. The two histograms show the average quality for constant r (top) and
constant ρ (right). The observed emission line fluxes are reasonably well
reproduced by photo-ionized gas with a density of ρ = 10"? g cm⁻³ and located
at about (1-4) R⊙.
Extended Data Fig. 5 | Incident EUV flux and mass loss rates as a function of
orbital separation. a, Comparison of the irradiating EUV flux around T Tauri
stars (yellow-shaded region) and that of WD J0914+1914 (red line). The outer
border of the warm Neptune desert is indicated by the vertical dashed line. The
orbital separation of the planet orbiting WD J0914+1914 estimated from the
size of the accretion disk is about (14-16) R⊙ (grey-shaded region). Subject to an
EUV luminosity comparable to that of planets around T Tauri stars, the giant
planet at WD J0914+1914 is well within the warm Neptune desert. b, Mass loss
rates estimated from the assumption of recombination and energy limited
hydrodynamic escape for a Jupiter mass and a Neptune mass planet.
Substantial mass loss could be generated even for separations of up to a few
hundred solar radii, well beyond the estimated orbital location of the giant
planet at WD J0914+1914.
Extended Data Fig. 6 | Comparison of the Lyα emission of WD J0914+1914
with the Sun. a, Lyα irradiance of the Sun across a full solar activity cycle as
measured by the SORCE SOLSTICE instrument. The radiation pressure on
neutral interplanetary hydrogen in the solar system usually exceeds the
gravitational force exerted by the Sun. b, The Lyα flux of the Sun during
minimum (2008) and maximum (2014) in comparison to the emission of
WD J0914+1914 at a distance of 15 R⊙. Given that WD J0914+1914 is less massive than
the Sun, and that its Lyα flux is comparable to that of the Sun in the core of the
line, but much larger in the wings (even during the 2014 solar maximum),
radiation pressure strongly impedes the inflow of hydrogen, explaining the
large depletion of hydrogen with respect to oxygen and sulfur within the
circumstellar disk.
Extended Data Fig. 7 | Final separation after common envelope evolution as
a function of planetary mass. We adopted two common envelope efficiencies,
α = 0.25 (solid line) and α = 1.0 (dashed line), to calculate an upper limit for the
final separation if the progenitor of WD J0914+1914 and the planet
evolved through a common envelope phase. The parameter space of possible
outcomes of common envelope evolution lies below these lines (grey-shaded
region). We consider the smaller efficiency to be more realistic. For
configurations below the red line, the planetary mass object will
evaporate inside the giant envelope; below the blue line, it would overflow
its Roche lobe. Only planets with parameters within the green-shaded region
can survive common envelope evolution. Whereas common envelope
evolution can bring a Jupiter-mass planet to the estimated location of the
planet around WD J0914+1914 (at (14-16) R⊙), smaller planets will be evaporated
in the giant envelope.
Extended Data Fig. 8 | The evolution of the mass loss rate. White dwarfs cool
with time and as a consequence their EUV luminosity decreases. We calculated
model spectra for effective temperatures from 80,000 K to 10,000 K,
integrated the EUV flux, and determined the mass loss rate of a Jupiter and a
Neptune at a distance of 10 R⊙. At a cooling age of 364 million years the white
dwarf will have cooled down to 12,000 K, the mass loss rate will drop below
about 10° g s⁻¹, and the resulting photospheric contamination by oxygen and
sulfur will become undetectable. Integrating the mass loss rate over the entire
cooling time results in a total mass loss of about 0.002 M_Jup, which corresponds
to about 3.7% of the mass of Neptune.
Extended Data Table 1 | White dwarf parameters

effective temperature T_eff [K]        27743 ± 310
surface gravity log g [cgs units]      7.85 ± 0.06
white dwarf mass M_wd [M⊙]             0.56 ± 0.03
cooling age [Myr]                      13.3 ± 0.5
progenitor mass [M⊙]                   1.0-1.6
Gaia parallax [milli-arcsec]           2.17 ± 0.47
u_SDSS [mag]                           18.629 ± 0.026
g_SDSS [mag]                           18.771 ± 0.022
r_SDSS [mag]                           19.198 ± 0.015
i_SDSS [mag]                           19.529 ± 0.022
z_SDSS [mag]                           19.849 ± 0.087
SDSS spectroscopic identifiers         53700-2286-0021
[MJD-PLT-FIBER]                        56017-5768-0660

MJD-PLT-FIBER, the modified Julian date and plate and fibre numbers that identify the SDSS
spectrum.


Extended Data Table 2 | Element number abundances, log(Z/H)

Element   photosphere    disk           disk scaled   solar
He        < -2.1                                      -1.07
C         < -4.8         < -1.17        < -4.71       -3.57
N         < -3.7         < -1.00        < -4.54       -4.17
O         -3.25 ± 0.2    0.29 ± 0.3     -3.25         -3.31
Na        < -4.3         < -3.85        < -7.39       -5.76
Mg        < -5.8         < -2.10        < -5.64       -4.40
Al        < -6.0         < -2.70        < -6.24       -5.55
Si        < -5.2         < -3.19        < -6.73       -4.49
S         -4.15 ± 0.2    -0.21 ± 0.3    -3.75         -4.88
K         < -5.2         < -3.67        < -7.21       -6.97
Ca        < -6.0         < -6.18        < -9.72       -5.66
Mn        < -4.4                                      -6.57
Fe        < -4.2         < -4.03        < -7.57       -4.50
Zn        < -3.0                                      -7.44


The number abundances in the white dwarf photosphere were derived from fitting model spectra 
to the oxygen and sulfur lines detected in the X-Shooter spectrum. Upper limits were obtained 
from the non-detection of the strongest lines of the individual elements. The number abundances 
in the circumstellar disk were derived from fitting Cloudy models to the observed flux ratios of 
the emission lines of hydrogen, oxygen and sulfur, and upper limits for the remaining elements 
were obtained from the non-detection of corresponding emission lines. Hydrogen is strongly 
depleted in the disk. To facilitate the comparison between these two independent measurements, 
the column ‘disk scaled’ gives the abundances and upper limits obtained from the model of the 
gaseous disk scaled to match the photospheric oxygen abundance. Solar number abundances 
are provided as reference. 
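The relation between the 'disk' and 'disk scaled' columns can be checked directly: scaling to the photospheric oxygen abundance amounts to adding a constant offset in log space. The sketch below uses values transcribed from Extended Data Table 2; the dictionary and variable names are hypothetical.

```python
# log(Z/H) values transcribed from Extended Data Table 2.
PHOTOSPHERE_O = -3.25

DISK = {
    "C": -1.17, "N": -1.00, "O": 0.29, "Na": -3.85, "Mg": -2.10, "Al": -2.70,
    "Si": -3.19, "S": -0.21, "K": -3.67, "Ca": -6.18, "Fe": -4.03,
}
DISK_SCALED = {
    "C": -4.71, "N": -4.54, "O": -3.25, "Na": -7.39, "Mg": -5.64, "Al": -6.24,
    "Si": -6.73, "S": -3.75, "K": -7.21, "Ca": -9.72, "Fe": -7.57,
}

# Offset that maps the disk oxygen abundance onto the photospheric value.
offset = PHOTOSPHERE_O - DISK["O"]

# Largest difference between the tabulated and recomputed scaled values.
max_residual = max(abs(DISK[el] + offset - DISK_SCALED[el]) for el in DISK)
print(f"offset = {offset:.2f} dex, largest residual = {max_residual:.3f} dex")
```

Every entry is reproduced by the same offset of -3.54 dex. After this scaling the disk sulfur abundance (-3.75) lies within the combined uncertainties of the photospheric value (-4.15 ± 0.2).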
Exceptional points (EPs) are special spectral degeneracies of non-Hermitian
Hamiltonians that govern the dynamics of open systems. At an EP, two or more
eigenvalues, and the corresponding eigenstates, coalesce. Recently, it was
predicted that operation of an optical gyroscope near an EP results in improved
response to rotations. However, the performance of such a system has not been
examined experimentally. Here we introduce a precisely controllable physical system
for the study of non-Hermitian physics and nonlinear optics in high-quality-factor
microresonators. Because this system dissipatively couples counter-propagating
lightwaves within the resonator, it also functions as a sensitive gyroscope for the
measurement of rotations. We use our system to investigate the predicted
EP-enhanced Sagnac effect and observe a four-fold increase in the Sagnac scale factor
by directly measuring rotations applied to the resonator. The level of enhancement
can be controlled by adjusting the system bias relative to the EP, and modelling results
confirm the observed enhancement. Moreover, we characterize the sensitivity of the
gyroscope near the EP. Besides verifying EP physics, this work is important for the
understanding of optical gyroscopes.


The use of optical microresonators as sensors is being studied across a
wide range of applications, including biomolecule and nanoparticle
detection, temperature measurement and rotation measurement.
A recently introduced approach to enhancing the response of optical
microresonator sensors uses the physics of EPs; operation near
an EP boosts the sensor response to a perturbation by an amount that
increases with the proximity of the sensor's operating point to the EP.
Here we study the EP-induced modification of the Sagnac effect in a
microresonator ring laser gyroscope.

The state vectors of a microresonator ring laser gyroscope are admixtures
of clockwise (CW) and counter-clockwise (CCW) optical modes
and, as will be shown, their location on a Bloch sphere (Fig. 1a) is precisely
controllable using Brillouin-induced dispersion. This dispersion is
applied independently to the CW and CCW directions using
counter-propagating pump waves. Brillouin scattering causes a pump photon
with frequency ω_P,j (j = 1, 2) to scatter from a co-propagating acoustic
phonon with frequency Ω_phonon into a backward-propagating Stokes
photon. In the resonator, the associated phase-matching condition
requires that the Brillouin shift frequency (Ω_phonon) is close in value to an
integer multiple (ℓ) of the free spectral range (FSR) of the resonator, as
shown in Fig. 1b (that is, Ω_phonon ≈ ℓ × FSR). This is achieved by microfabrication
control of the resonator diameter and in effect locates a resonator
mode (the Stokes mode) within the Brillouin optical-gain spectrum for
efficient stimulated Brillouin laser (SBL) action. Counter-pumping
is performed on the same resonant-mode number (m) so that laser
action occurs on two counter-propagating Stokes waves belonging to
a single mode number (set to m − 6 in this measurement; that is, ℓ = 6).


To better reveal the non-Hermitian physics of this system, we consider
the equation of motion, which reads i dΨ/dt = H₀Ψ, where Ψ = (α₁, α₂)^T is
the laser mode column vector with amplitudes normalized so that |α₁|² and |α₂|²
are the photon numbers of the CW and CCW components, respectively,
and H₀ is the non-Hermitian Hamiltonian governing the time evolution:

$$H_0 = \begin{pmatrix} \omega_0 + i\left[g_1 |A_1|^2 - \gamma/2\right] & i\kappa \\ i\kappa & \omega_0 + i\left[g_2 |A_2|^2 - \gamma/2\right] \end{pmatrix} \qquad (1)$$

In this expression A₁ and A₂ represent the photon-number-normalized
amplitudes of the CCW and CW components of the pump modes,
respectively, ω₀ is the unpumped frequency of the Stokes cavity mode
and γ is the cavity damping rate. g_j = g₀/[1 + (2iΔΩ_j/Γ)], for j = 1, 2, is
the Brillouin gain factor, where g₀ is the gain coefficient, Γ is the gain
bandwidth and ΔΩ_j = ω_P,j − ω_s − Ω_phonon is the frequency mismatch, with
ω_s the Stokes lasing frequency (an eigenfrequency of equation (1)).
The real part of the Brillouin gain factor leads to amplification of the
Stokes mode, whereas the imaginary part is responsible for dispersion
and consequently mode pulling. κ is the dissipative coupling rate
between the two SBL modes and is examined in the Supplementary
Information.
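The structure of this Hamiltonian can be explored numerically. The sketch below builds the 2 × 2 matrix as we read equation (1), with diagonal entries ω₀ + i[g_j|A_j|² − γ/2] and off-diagonal entries iκ, and computes its eigenvalues in closed form. The pump photon numbers are set by assuming that the real part of the gain exactly balances the loss, and all numerical values are hypothetical.

```python
import cmath


def brillouin_gain(g0, gain_bandwidth, mismatch):
    """Complex gain factor g_j = g0 / (1 + 2i * mismatch / gain_bandwidth)."""
    return g0 / (1.0 + 2.0j * mismatch / gain_bandwidth)


def clamped_pump(g0, gain_bandwidth, mismatch, gamma):
    """Pump photon number at which Re(g_j) |A_j|^2 equals gamma / 2 (assumed)."""
    return gamma * (1.0 + (2.0 * mismatch / gain_bandwidth) ** 2) / (2.0 * g0)


def hamiltonian(omega0, gamma, kappa, g0, gain_bandwidth, mismatch_1, mismatch_2):
    """2x2 non-Hermitian matrix for the CW and CCW Stokes modes."""
    diagonal = []
    for mismatch in (mismatch_1, mismatch_2):
        gain = brillouin_gain(g0, gain_bandwidth, mismatch)
        pump = clamped_pump(g0, gain_bandwidth, mismatch, gamma)
        diagonal.append(omega0 + 1.0j * (gain * pump - gamma / 2.0))
    return ((diagonal[0], 1.0j * kappa), (1.0j * kappa, diagonal[1]))


def eigenvalues(matrix):
    """Closed-form eigenvalues of a complex 2x2 matrix."""
    (a, b), (c, d) = matrix
    mean = (a + d) / 2.0
    root = cmath.sqrt(((a - d) / 2.0) ** 2 + b * c)
    return mean + root, mean - root


# Hypothetical parameters in arbitrary frequency units.
GAMMA, GAIN_BW, KAPPA, G0 = 1.0, 10.0, 0.05, 1.0e-3
locked = eigenvalues(hamiltonian(0.0, GAMMA, KAPPA, G0, GAIN_BW, 0.1, -0.1))
at_ep = eigenvalues(hamiltonian(0.0, GAMMA, KAPPA, G0, GAIN_BW, 0.5, -0.5))
unlocked = eigenvalues(hamiltonian(0.0, GAMMA, KAPPA, G0, GAIN_BW, 5.0, -5.0))
print(locked, at_ep, unlocked)
```

With these choices the diagonal entries reduce to ω₀ + (γ/Γ)ΔΩ_j, and the two eigenvalues coalesce when (γ/2Γ)|ΔΩ₁ − ΔΩ₂| equals κ. For smaller detuning differences the eigenfrequencies share the same real part (the locked regime), and for larger ones they split.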

In the absence of backscattering (x = 0), the CW and CCW SBL pro- 
cesses are independent because the Brillouin gain is intrinsically direc- 
tional as a result of the phase-matching condition (Fig. 1a). The 
steady-state lasing condition requires the power loss rate y to be bal- 
anced by the Brillouin gain, which leads to the clamping condition of 
Fig. 1 | Brillouin control of state vectors in a non-Hermitian system. a, Dual-SBL 
process in a microresonator. Centre, The green (blue) solid curve 
represents pump 1 (pump 2) with angular frequency $\omega_{p1}$ ($\omega_{p2}$) and the red 
(yellow) solid curve denotes SBL 1 (SBL 2). The orange wavy line represents 
acoustic phonons with angular frequency $\Omega_{phonon}$. Pumps and output waves are 
coupled onto the fibre-taper waveguide (light blue). As discussed 
in Supplementary Information, the backscattering κ is believed to originate 
primarily from the coupling of the waveguide to the resonator. Left, Brillouin 
energy and momentum conservation constraints (phase matching) for 
scattering of a pump wave into a Stokes wave. ω, angular frequency of the light; 
k, angular wavenumber of the light; n, refractive index; c, speed of light in 
vacuum. Right, CW and CCW modes experience dissipative coupling at rate κ. 
This coupling creates eigenmodes that are mapped onto a Bloch sphere 
containing dual EPs (black dots). The trajectories on the Bloch sphere show the 
evolution of two eigenmodes (red for SBL 1 and yellow for SBL 2) under Brillouin 
control when the pump detuning frequency decreases from large positive to 
large negative values. The low-loss and high-loss eigenmodes inside the locking 


the pump powers, $|A_j|^2 = \frac{\gamma}{2g_0}\left[1 + \left(\frac{2\Delta\Omega_j}{\Gamma}\right)^2\right]$. As shown in Supplementary 
Information, these conditions remain approximately valid for non-zero 
dissipative backscattering (κ ≠ 0) within the regime in which EP-enhanced 
rotation measurement is performed (the unlocked regime 
defined below). As a result, above the lasing threshold equation (1) is 
simplified to the following form: 


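$$H_0 = \begin{pmatrix} \omega_0 + \frac{\gamma}{\Gamma}\Delta\Omega_1 & i\kappa \\ i\kappa & \omega_0 + \frac{\gamma}{\Gamma}\Delta\Omega_2 \end{pmatrix} \quad (2)$$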
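
The step from equation (1) to equation (2) can be checked numerically. The sketch below assumes the Lorentzian gain factor and the clamping condition quoted in the text; the function names and all numerical values are illustrative and are not taken from the paper.

```python
def brillouin_gain(d_omega, g0, gain_bw):
    """Complex Brillouin gain factor g = g0 / (1 + 2i*d_omega/gain_bw)."""
    return g0 / (1.0 + 2j * d_omega / gain_bw)


def clamped_pump_photons(d_omega, g0, gain_bw, gamma):
    """Pump photon number at which Re(g)*|A|^2 balances the loss gamma/2."""
    return gamma / (2.0 * g0) * (1.0 + (2.0 * d_omega / gain_bw) ** 2)


def diagonal_term(d_omega, omega0, g0, gain_bw, gamma):
    """Diagonal entry of equation (1): omega0 + i*[g*|A|^2 - gamma/2]."""
    g = brillouin_gain(d_omega, g0, gain_bw)
    n_pump = clamped_pump_photons(d_omega, g0, gain_bw, gamma)
    return omega0 + 1j * (g * n_pump - gamma / 2.0)


# Illustrative, dimensionless numbers (frequencies in units of the gain bandwidth).
omega0, g0, gain_bw, gamma = 0.0, 1e-3, 1.0, 0.075
for d_omega in (-0.3, 0.0, 0.2):
    full = diagonal_term(d_omega, omega0, g0, gain_bw, gamma)
    reduced = omega0 + (gamma / gain_bw) * d_omega  # diagonal entry of equation (2)
    print(d_omega, full, reduced)
```

Under clamping the gain exactly cancels the loss, so the imaginary part of each diagonal entry vanishes and only the real mode-pulling shift $(\gamma/\Gamma)\Delta\Omega_j$ remains.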


With the introduction of κ, the lasing system exhibits a frequency 
locking-unlocking transition when varying the pump detuning frequency. 
The locking regime is known to create a sensing dead band for rotations 
in ring laser gyroscopes. In the frequency-unlocked regime, the two 
(see Supplementary Information for additional discussion). b, Efficient laser 
action requires that each Stokes mode (black; with linewidth γ and separated 
from the pump by a multiple of the cavity FSR) lies within the Brillouin gain 
band (orange; with linewidth Γ), which, through the phase-matching condition, 
is shifted relative to the pump by $\Omega_{phonon} = 4\pi n c_s/\lambda_p$ ($c_s$, speed of sound in silica; 
$\lambda_p$, pump wavelength). Here FSR ≈ 1.8 GHz, so that 6 × FSR approximately 
matches the Brillouin shift of about 10.8 GHz. Dispersion from the Brillouin 
gain pulls the Stokes lasing modes by different amounts towards the gain 
centre on account of the difference in pump angular frequencies, $\Delta\omega_p$. c, The 
blue solid curve (red dashed line) shows the dependence of the dual-SBL 
beating angular frequency, $\Delta\omega_s$, on the normalized pump detuning frequency, 
$\Delta\omega_p/\Delta\omega_c$, for κ ≠ 0 (κ = 0) as per equation (4). The yellow wavy arrow represents 
the input rotation signal, and the blue (red) solid wavy arrow denotes the 
output signal with (without) the EP. The inset is the κ ≠ 0 Sagnac scale factor 
normalized to the κ = 0 scale factor and shows the enhancement near the EP. 
a.u., arbitrary units. 


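lasing modes have distinct angular frequencies $\omega_{s+}$ and $\omega_{s-}$, which are 
the eigenvalues of the Hamiltonian (equation (2)): 

$$\omega_{s\pm} = \bar{\omega}_s + \frac{\gamma/(2\Gamma)}{1 + \gamma/\Gamma}\left(\Delta\omega_p \pm \sqrt{\Delta\omega_p^2 - \Delta\omega_c^2}\right) \quad (3)$$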


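where $\bar{\omega}_s = \{\omega_0 + [\gamma(\omega_{p1} - \Omega_{phonon})/\Gamma]\}/(1 + \gamma/\Gamma)$, $\Delta\omega_p = \omega_{p2} - \omega_{p1}$ is the 
pump detuning frequency and $\Delta\omega_c = 2\Gamma\kappa/\gamma$ is the critical pump frequency 
detuning at which the system state is at an EP. In deriving this 
result, it is important to note that the Hamiltonian (equation (2)) 
depends weakly on its own eigenvalues through $\Delta\Omega_1$ and $\Delta\Omega_2$ (see 
derivation in Supplementary Information). The SBL beating frequency 
is readily extracted by taking the difference of the above eigenfrequencies, 
$\Delta\omega_s = \omega_{s+} - \omega_{s-}$: 

$$\Delta\omega_s = \frac{\gamma/\Gamma}{1 + \gamma/\Gamma}\sqrt{\Delta\omega_p^2 - \Delta\omega_c^2} \quad (4)$$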
Fig. 2 | Measurement of the eigenmode properties. a, Typical measured dual-SBL 
beating spectrum. b, Typical pump-SBL beating spectrum with the 
frequency axis shifted approximately by 10.845 GHz to centre the pump 1-SBL 1 
beating-frequency peak. The individual pump-SBL beating-frequency peaks 
are identified. c, Measured dual-SBL beating frequency versus pump detuning 
frequency (blue circles). The red solid curve is a fit (with γ/Γ = 0.073 and 
κ = 1.80 kHz) and the black dotted line corresponds to κ = 0 (γ/Γ = 0.073). The 
data have a slope of 1/2 (slope of 1) near (away from) the EP in the log-log plot in 
the inset. These data correspond to a mode with larger κ than that used in d. d, 
Measured shifted frequencies of the two SBLs, $(\omega_{s\pm} - \bar{\omega}_s)/(2\pi)$, versus the pump 


This equation is plotted in Fig. 1c. The dissipative coupling between 
the CW and CCW lasing modes induces second-order EPs at the critical 
pump-detuning frequencies, $|\Delta\omega_p| = \Delta\omega_c$, where the eigenfrequencies 
and the eigenmodes coalesce. For pump detuning $|\Delta\omega_p| > \Delta\omega_c$ the 
eigenfrequencies bifurcate and the eigenmodes are an unbalanced 
hybridization of CW and CCW modes. For pump detuning $|\Delta\omega_p| < \Delta\omega_c$ 
the eigenvalues have equal real parts (frequencies) but different imaginary 
parts (loss rates). 
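
The locking-unlocking transition described above can be illustrated with a traceless toy version of equation (2). In this sketch `delta` is a hypothetical stand-in for the difference between the two diagonal entries, and the weak dependence of the Hamiltonian on its own eigenvalues is ignored; it is an illustration under those assumptions, not the authors' model.

```python
import numpy as np


def eigenfrequencies(delta, kappa):
    """Eigenvalues of [[+delta/2, i*kappa], [i*kappa, -delta/2]]."""
    h = np.array([[delta / 2.0, 1j * kappa],
                  [1j * kappa, -delta / 2.0]], dtype=complex)
    return np.linalg.eigvals(h)


def beat_frequency(delta, kappa):
    """Difference of the real parts (the observable beat note)."""
    ev = eigenfrequencies(delta, kappa)
    return abs(ev[0].real - ev[1].real)


def loss_difference(delta, kappa):
    """Difference of the imaginary parts (the loss-rate contrast)."""
    ev = eigenfrequencies(delta, kappa)
    return abs(ev[0].imag - ev[1].imag)


kappa = 1.8  # kHz, illustrative
for delta in (1.0, 3.0, 3.6, 4.0, 10.0):  # EP of this toy model at |delta| = 2*kappa
    print(delta, beat_frequency(delta, kappa), loss_difference(delta, kappa))
```

For `|delta| < 2*kappa` the beat note vanishes and only the loss rates differ (locked regime); for `|delta| > 2*kappa` the beat note grows as `sqrt(delta**2 - 4*kappa**2)`, which has the same square-root form as equation (4).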

To verify the EP physics predicted above, a high-quality-factor 
($Q \approx 10^8$) silica wedge resonator (Fig. 1a) is counter-pumped at distinct 
frequencies determined by radio-frequency modulation of a single laser 
(wavelength about 1,552.5 nm). Coupling into the resonator is realized 
from both ends of a fibre taper. One of the pump frequencies is 
Pound-Drever-Hall-locked to a resonator mode by feedback control 
on the laser. The second pump frequency is then tuned for state vector 
control. The two pump powers are stabilized using power feedback. 
Further details about the experimental setup are provided in Methods. 

An electrical spectrum analyser is used to measure the photodetected 
dual-SBL beating frequency $\Delta\omega_s/(2\pi)$ (Fig. 2a) and the SBL-pump 
beating frequency (Fig. 2b). For the latter, the detection is made 
along the direction of propagation of pump 1. Plots of these frequen- 
cies versus the pump frequency detuning are given in Fig. 2c, d. The 
provided comparisons with equations (3), (4) show good agreement 
between theory and measurement. Moreover, the ratio of the CCW com- 
ponents of the eigenmodes is determined from the strength of the beat 
notes between the CCW pump and the SBL signals (see Supplementary 
Information for analysis) and is plotted as the inset of Fig. 2d. There is 
a reasonable agreement between the model and the measurement. 
detuning frequency. Theoretical values of $(\omega_{s+} - \bar{\omega}_s)/(2\pi)$ and $(\omega_{s-} - \bar{\omega}_s)/(2\pi)$ with 
γ/Γ = 0.076 and κ = 1.23 kHz are plotted as red and yellow lines, respectively. The 
experimental data of the shifted SBL 1 (SBL 2) frequency are shown as blue 
(purple) circles. The inset shows the measured power ratio of the CCW 
components of the lasing modes (blue circles), obtained by the analysis of the 
spectral components in b, and agrees reasonably well with the theoretical 
prediction (red solid curve). $I_{p1s1}$ ($I_{p1s2}$) is the strength of the beating signal of 
pump 1 and SBL 1 (SBL 2). In the main plots of c and d, the errors in the frequency 
measurement are typically smaller than the marker size. 


When the resonator experiences an angular rotation rate of Ω (positive 
for the CW direction), the Sagnac effect induces opposing frequency 
shifts in the CW and CCW modes of a passive resonator such 
that the differential frequency shift of the CCW mode relative to the 
CW mode is $\Delta\omega_{Sagnac} = 2\pi D\Omega/(n_g\lambda)$, where D is the resonator diameter, 
$n_g$ is the group index of the passive cavity mode and λ is the laser wavelength. 
Introducing the opposing frequency shifts (1/2 of the magnitude 
of the differential shift) as a perturbation into $H_0$ (equation (2)) 
modifies the SBL beating frequency as follows: 


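$$\Delta\omega_s = \frac{\gamma/\Gamma}{1 + \gamma/\Gamma}\sqrt{\left(\Delta\omega_p + \Gamma\Delta\omega_{Sagnac}/\gamma\right)^2 - \Delta\omega_c^2} \quad (5)$$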


Accordingly, the counter-pumped Brillouin system can function as a 
gyroscope to measure the rotation signal Ω by monitoring the dual-SBL 
beating frequency $\Delta\omega_s$. For comparison with the measurements, the 
small-signal Sagnac scale factor is now calculated as the derivative of 
the SBL splitting frequency with respect to the applied rotation rate 
amplitude Ω: 


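$$S \equiv \frac{\partial\Delta\omega_s}{\partial\Omega} = \frac{1}{1 + \gamma/\Gamma}\,\frac{\Delta\omega_p}{\sqrt{\Delta\omega_p^2 - \Delta\omega_c^2}}\,\frac{2\pi D}{n_g\lambda} \quad (6)$$

where a linear response requires $\Gamma\Delta\omega_{Sagnac}/\gamma \ll \Delta\omega_p$. In this equation, 
the coefficient $1/[1 + (\gamma/\Gamma)]$ is a correction resulting from the mode-pulling 
effect and $\Delta\omega_p/\sqrt{\Delta\omega_p^2 - \Delta\omega_c^2}$ is the EP enhancement factor. This 
enhancement originates from the steep slope of the response curve 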
Fig. 3 | Measured Sagnac scale factor $S(\Delta\omega_p)$ compared with model results. 
The blue dots denote data and the red curve is the theoretical prediction of S 
through equation (6). Each point is an average of four measurements and the 
error bars denote one standard deviation. The Brillouin mode pulling factor, 
$1/[1 + (\gamma/\Gamma)]$, slightly reduces the Sagnac factor at large pump detuning. The 
black dashed line gives the conventional (without EP enhancement and 
Brillouin correction) Sagnac factor. The inset shows a log-log plot of five data 
points near the EP. The red line has a slope of -1/2, further verifying that the 
Sagnac enhancement is proportional to $(\Delta\omega_p^2 - \Delta\omega_c^2)^{-1/2}$. The approximation 
$\Delta\omega_p^2 - \Delta\omega_c^2 \approx 2\Delta\omega_c(\Delta\omega_p - \Delta\omega_c)$ is used for the horizontal axis scale. 


near the EP (Fig. 1c) so the scale factor S surpasses the conventional 
Sagnac value (that is, $2\pi D/(n_g\lambda)$). Also, we note that the sign of S depends 
on the sign of $\Delta\omega_p$ because the latter determines which SBL wave (CW 
or CCW) has the higher frequency. 
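
As a rough numerical illustration of equation (6), the sketch below evaluates the conventional Sagnac factor and the EP enhancement factor. The diameter (36.0 mm) and wavelength (1,552.5 nm) are quoted in the text; the group index of 1.46 and the ratio γ/Γ = 0.073 used here are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def conventional_scale_factor(diameter_m, group_index, wavelength_m):
    """Conventional Sagnac factor 2*pi*D/(n_g*lambda), dimensionless."""
    return 2.0 * math.pi * diameter_m / (group_index * wavelength_m)


def ep_enhancement(dw_p, dw_c):
    """EP enhancement factor dw_p / sqrt(dw_p^2 - dw_c^2) (unlocked regime only)."""
    if abs(dw_p) <= dw_c:
        raise ValueError("inside the locking band: no beat note to differentiate")
    return dw_p / math.sqrt(dw_p ** 2 - dw_c ** 2)


def sagnac_scale_factor(dw_p, dw_c, gamma_over_gain_bw,
                        diameter_m, group_index, wavelength_m):
    """Equation (6): mode-pulling correction * EP enhancement * conventional factor."""
    pulling = 1.0 / (1.0 + gamma_over_gain_bw)
    return (pulling * ep_enhancement(dw_p, dw_c)
            * conventional_scale_factor(diameter_m, group_index, wavelength_m))


s0 = conventional_scale_factor(36.0e-3, 1.46, 1552.5e-9)  # n_g = 1.46 is assumed
for ratio in (1.01, 1.05, 1.5, 5.0):                      # dw_p / dw_c
    s = sagnac_scale_factor(ratio, 1.0, 0.073, 36.0e-3, 1.46, 1552.5e-9)
    print(ratio, s / s0)
```

In this expression a four-fold enhancement factor corresponds to a pump detuning of about 1.033 times the critical detuning, and a negative detuning flips the sign of S.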

To measure rotations, the resonator is packaged in a small metal box 
with one edge hinged and the opposing end attached to a piezoelectric 
stage, in a manner similar to that used in ref.". (As an aside, ref." used 
a single pump in a cascaded SBL arrangement for rotation sensing. 
Such an arrangement, however, excludes EP physical effects because 
the underlying states occur at distinct cavity longitudinal modes.) To 
create a precise rotation, a sinusoidal oscillation is generated by the 
piezoelectric stage at a rate of 1 Hz with a fixed amplitude (equivalent 
to 410° h⁻¹). The resulting time-varying dual-SBL beating frequency is 
recorded using a frequency counter, and the amplitude of the modu- 
lated frequency is extracted by applying a fast Fourier transform to 
the counter signal. Frequency modulation amplitudes are recorded at 
several pump frequency detunings. The resulting Sagnac scale factor 
(that is, the amplitude of the SBL frequency-difference modulation 
divided by the amplitude of the applied rotation rate) is plotted in Fig. 3. 
A boost in the Sagnac factor by a factor of 4 compared to the non-EP- 
enhanced case is observed when operating close to the EP (that is, near 
the critical detuning frequency). There is good agreement between 
equation (6) and the measurement, as shown in Fig. 3. 
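
The amplitude-extraction step described above can be sketched as follows. The 1 Hz drive and the 410° h⁻¹ rotation amplitude come from the text; the sample rate, record length, beat frequency, modulation depth and noise level of the synthetic record are hypothetical, and the function name is an illustrative choice rather than the authors' code.

```python
import numpy as np


def modulation_amplitude(freq_samples, sample_rate_hz, mod_freq_hz):
    """Amplitude of the sinusoidal component at mod_freq_hz in a counter record."""
    x = np.asarray(freq_samples, dtype=float)
    x = x - x.mean()                                # remove the static beat frequency
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate_hz)
    k = np.argmin(np.abs(freqs - mod_freq_hz))      # bin nearest the drive frequency
    return 2.0 * np.abs(spectrum[k]) / x.size       # single-sided amplitude


# Synthetic counter record: static beat note plus a 1 Hz modulation and noise.
sample_rate_hz = 100.0
t = np.arange(1000) / sample_rate_hz                # 10 s record, 10 full drive cycles
rng = np.random.default_rng(0)
record = 50e3 + 20.0 * np.sin(2 * np.pi * 1.0 * t) + rng.normal(0.0, 0.5, t.size)

amp_hz = modulation_amplitude(record, sample_rate_hz, 1.0)
scale_factor = amp_hz / 410.0                       # Hz per (degree per hour)
print(amp_hz, scale_factor)
```

Choosing a record length that contains an integer number of drive cycles places the modulation exactly on one FFT bin, which avoids spectral leakage in the amplitude estimate.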

Whereas the Sagnac scale factor is observed to increase near the EP, 
fluctuation mechanisms exert a greater impact on the measurement, 
leading to relatively larger errors. To better understand the perfor- 
mance of the gyroscope, we record the beating frequency of the SBLs 
for up to 600 s without external rotation. The Allan deviation of the 
beating frequency is then calculated and normalized by the enhanced 
Sagnac factor $S(\Delta\omega_p)$ (fitted to the data) so as to arrive at the Allan 
deviation expressed in rotation-rate units ($\sigma_\Omega$). The results are presented 
in Fig. 4 for several pump detuning frequencies both near and away 
from the EP. The smallest pump detuning frequencies are well within 
the region of EP-enhanced scale factors. At longer times, the Allan 
deviation data show a drift that is magnified near the EP. This drift is 
believed to be associated with temperature and power drift in the 
Fig. 4 | Allan deviation of the gyroscope readout at various bias points. The 
Allan deviation of the gyroscope readout is measured at several pump detuning 
frequencies ($\Delta\omega_p/(2\pi)$, given in the key) both near (small detuning) and away 
from (large detuning) the EP. The error bars represent one standard deviation. 
Inset, Allan deviation of the noise of the SBL beat frequency for the same pump 
detuning frequencies. In this measurement κ/(2π) = 1.36 kHz and γ/Γ = 0.0773. 


system and causes the fluctuations near the EP shown in the data in 
Fig. 3. At short times, the Allan deviation exhibits angular random walk 
behaviour and decreases with increasing averaging time. Here, operation 
in the EP region has little or no impact on the gyroscope sensitivity. 
To further understand this result, the Allan deviation of the 
un-normalized frequency noise ($\sigma_{\Delta\omega_s/2\pi}$) is plotted in the inset of Fig. 4 
and shows that the frequency noise is enhanced in the vicinity of the 
EP. Indeed, this enhanced noise exactly offsets the scale factor enhancement 
in the angular random walk regime of the main plot in Fig. 4. The 
role of technical noise and the consideration of fundamental limits 
to the signal-to-noise ratio of sensors near an EP are recent areas of 
study, and the source of the enhanced noise in the SBL system is under 
investigation. 
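
The normalization of the Allan deviation into rotation-rate units can be sketched as below. The text does not say which estimator was used, so the simple non-overlapping form is an assumption here, and the function names and test record are hypothetical.

```python
import numpy as np


def allan_deviation(freq_samples, m):
    """Non-overlapping Allan deviation for averaging factor m (tau = m * tau0)."""
    y = np.asarray(freq_samples, dtype=float)
    n_blocks = y.size // m
    if n_blocks < 2:
        raise ValueError("need at least two averaging blocks")
    # Average the record in consecutive blocks of m samples.
    block_means = y[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
    # Allan variance: half the mean squared difference of adjacent block means.
    return np.sqrt(0.5 * np.mean(np.diff(block_means) ** 2))


def rotation_allan_deviation(freq_samples, m, scale_factor):
    """Allan deviation converted to rotation-rate units by dividing by |S|."""
    return allan_deviation(freq_samples, m) / abs(scale_factor)


# White frequency noise as a stand-in for a beat-note record (hypothetical numbers).
rng = np.random.default_rng(1)
record = rng.normal(0.0, 1.0, 6000)
for m in (1, 10, 100):
    print(m, allan_deviation(record, m), rotation_allan_deviation(record, m, 4.0))
```

Because the same record is divided by S, a larger scale factor lowers the rotation-rate Allan deviation only if the underlying frequency noise does not grow by the same factor, which is the comparison made in the main plot and inset of Fig. 4.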

In summary, phase matching of Brillouin gain and dispersion in a 
microresonator system has been shown to provide precise control of 
the CW and CCW laser modes near an EP. This control and the inherent 
high relative stability of the laser modes enable the observation of an 
enhanced system response to rotations due to the EP-induced modification 
of the Sagnac effect. By measuring rotations with an approximate 
amplitude of one revolution per hour, it is possible to observe a four-fold 
increase of the Sagnac scale factor near the EP. A corresponding 
sensitivity enhancement with respect to the rotation measurement, as 
inferred from the measurement of the Allan deviation, was not observed. 
This work provides a platform for studying EPs in a nonlinear optical 
system and specifically in the context of rotation sensing. 

Note added in proof. A study of the enhanced frequency noise in 
the angular random walk regime in Fig. 4 is presented in ref. *. 
Methods 


Experimental setup 

In the measurement (see Extended Data Fig. 1), an erbium-doped fibre 
amplifier is used to boost the power of an external-cavity diode laser, 
which is split into two arms. Each arm is frequency-shifted by an 
acousto-optical modulator (AOM) and coupled into the resonator. One arm 
of the laser is Pound-Drever-Hall-locked to the centre of the cavity 
mode, and the other arm can be freely tuned by radio-frequency tuning 
of the AOM. The resonator is shielded passively using insulating foam 
and the temperature is monitored by a thermistor. Both pump powers 
are stabilized by active power-feedback control. When changing the 
pump detuning, the SBL power is almost unchanged (<8%). For the 
rotation measurement, one corner of the packaged gyroscope is fixed 
on a pivot point and the other side is placed on a piezoelectric stage so 
that a precise sinusoidal modulation can be applied. 


Silica wedge resonator 

The silica wedge resonator used in this experiment is 36.0 mm in diameter 
and 8 μm in thickness. The wedge angle is approximately 30°, which 
is not critical to the measurement. The Brillouin shift in the silica wedge 
resonator is ~10.8 GHz. Details on the fabrication of the silica wedge 
resonator are provided in ref. 7. 


Data availability 


The data that support the plots within this paper and other findings of 
this study are available from the corresponding author upon reason- 
able request. 
Gyroscopes are essential to many diverse applications associated with navigation, 
positioning and inertial sensing. In general, most optical gyroscopes rely on the 
Sagnac effect, a relativistically induced phase shift that scales linearly with the 
rotational velocity. In ring laser gyroscopes (RLGs), this shift manifests as a 
resonance splitting in the emission spectrum, which can be detected as a beat 
frequency. The need for ever more precise RLGs has fuelled research activities aimed 
at boosting the sensitivity of RLGs beyond the limits dictated by geometrical 
constraints, including attempts to use either dispersive or nonlinear effects. Here 
we establish and experimentally demonstrate a method using non-Hermitian 
singularities, or exceptional points, to enhance the Sagnac scale factor. By 
exploiting the increased rotational sensitivity of RLGs in the vicinity of an exceptional 
point, we enhance the resonance splitting by up to a factor of 20. Our results pave the 
way towards the next generation of ultrasensitive and compact RLGs and provide a 
practical approach for the development of other classes of integrated sensor. 


Sensing involves the detection of the signature that a perturbing agent 
leaves on a system. In optics and many other fields, resonant sensors 
are made to be as lossless as possible so as to exhibit high quality factors. 
As a result, their response is governed by standard perturbation 
theory, suited for loss-free or Hermitian arrangements. Recently, 
however, there has been a growing realization that non-Hermitian 
systems biased at exceptional points (EPs) can react much more 
drastically to external perturbations. This EP-enhanced sensitivity, 
a direct byproduct of Puiseux generalized expansions, is fundamental 
by nature. In particular, for a system supporting an Nth-order EP, where 
N eigenvalues coalesce and their corresponding eigenvectors collapse 
on each other, the reaction to a perturbation (ε) is expected to follow 
an Nth-root behaviour ($\epsilon^{1/N}$). This is in stark contrast to Hermitian systems, 
where the sensing response is at best of order ε. Given that $\epsilon^{1/N} \gg \epsilon$ 
for $|\epsilon| \ll 1$, this opens up new possibilities for designing ultrasensitive 
sensors based on such non-Hermitian spectral singularities. For 
illustration purposes, Fig. 1 provides a comparison between the eigenvalue 
surfaces associated with a Hermitian (Fig. 1a) two-level system 
(N = 2) and its corresponding non-Hermitian counterpart (Fig. 1b) when 
plotted in a two-parameter space around their corresponding spectral 
degeneracies. As shown in Fig. 1b, the presence of an EP forces the two 
Riemann manifolds to become strongly intertwined with each other, an 
attribute that could in turn be used to enhance the performance of a 
sensor. 
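
The square-root response of a second-order EP can be illustrated with a generic two-level model. The matrix below (detuning `eps`, coupling `kappa`, gain-loss parameter `gain_loss`) is a textbook-style toy Hamiltonian chosen for illustration, not the Jones-matrix model of the RLG used in this work.

```python
import numpy as np


def splitting(eps, kappa, gain_loss):
    """Magnitude of the eigenvalue splitting of a two-level toy Hamiltonian."""
    h = np.array([[eps / 2.0 + 1j * gain_loss, kappa],
                  [kappa, -eps / 2.0 - 1j * gain_loss]], dtype=complex)
    ev = np.linalg.eigvals(h)
    return abs(ev[0] - ev[1])


for eps in (1e-6, 1e-5, 1e-4, 1e-3):
    hermitian = splitting(eps, 0.0, 0.0)   # degenerate Hermitian system: linear in eps
    at_ep = splitting(eps, 1.0, 1.0)       # gain_loss = kappa places the toy model at its EP
    print(eps, hermitian, at_ep, at_ep / hermitian)
```

In this model the splitting at the EP is close to `2*sqrt(eps*kappa)` for small `eps`, so a hundred-fold smaller perturbation reduces the splitting only ten-fold, whereas the Hermitian splitting falls a hundred-fold.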

Given that sensing is important in many fields, the emerging idea of 
boosting the sensitivity of a particular system via non-Hermitian degen- 
eracies could have substantial ramifications across several technical 
areas. Here, we show that the sensitivity of a standard helium-neon 
(He-Ne) RLG can be drastically enhanced provided that its resona- 
tor is judiciously modified so as to support an EP. Figure 2 depicts a 


schematic of the non-Hermitian RLG used in this study. As opposed to 
a standard RLG, the retrofitted cavity involves a Faraday rotator (FR) 
and a half-wave plate (HWP). These two elements, acting in conjunction 
with the Brewster windows (BW) incorporated on both ends of the 
He-Ne gain tube, can be used to introduce a differential loss contrast 
(or gain contrast), Δγ, between the clockwise (CW) and the counter-clockwise 
(CCW) lasing modes. The method used to achieve this is 
depicted in Fig. 2a, where the evolution of the state of the polarization 
associated with the two counter-rotating modes is provided at three 
consecutive points (A, B, C) in the cavity. In this arrangement, the BWs 
allow only x-polarized light to circulate in the cavity while rejecting the 
y component. As a result, the CW mode enters the FR as x-polarized at 
point A. Because of the magneto-optic effect, the polarization subsequently 
rotates by a small angle α (point B). Under the action of the 
HWP, the angle between the linear electric-field component and the 
preferred x axis is 2β − α (point C), where the small angle β denotes the 
orientation of the fast axis of the HWP with respect to the x-y coordinate 
system. On the other hand, because of non-reciprocity, although the 
CCW mode also starts as x-polarized at point C, it exits at an angle of 
2β + α with respect to the x axis (point A) after traversing the same two 
optical components. Therefore, as clearly indicated in Fig. 2a, the CW 
mode is expected to experience lower losses than its CCW counterpart 
does, after passing through the BWs of the He-Ne tube. Hence, 
a differential loss (Δγ) can be introduced between these two counter-rotating 
modes. Finally, to establish an EP in this cavity, it is necessary 
to counteract this differential loss with a mode-coupling process. In 
our system, the coupling between the CW and CCW modes is readily 
induced using a weakly scattering object (for example, an etalon), 
as shown in Fig. 2a. The aforementioned processes can be formally 
described by employing a Jones calculus approach for the elements 
Fig. 1 | Conceptual illustrations comparing the eigenvalue surfaces 
associated with Hermitian and non-Hermitian two-level systems. a, The real 
part of the eigenvalues plotted in parameter space (η-ζ; normalized detuning, 
η, versus normalized coupling/gain-loss contrast, ζ) when the arrangement is 
Hermitian. Because of the Hermiticity, this system responds linearly to 


involved (HWP, FR, BW, scattering object), where the polarization state 
of the CW and CCW waves can be monitored after each pass through the 
following transfer matrix $T = S_C \times P \times J_{HWP} \times J_{FR} \times J_{BW}$ (see Supplementary 
Information). In this expression, $S_C$ represents a conservative scattering 
matrix (producing coupling) and P denotes a phase accumulation 
matrix that can in principle account for the Sagnac shift. The matrices 
$J_{HWP}$, $J_{FR}$ and $J_{BW}$ are the respective Jones matrices describing the change 
of polarization after each element. 
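
The origin of the differential loss can be sketched with textbook Jones matrices in a fixed x-y frame. This is a simplified illustration, not the full transfer matrix T of the Supplementary Information: the Faraday rotator is modelled as the same rotation for both propagation directions, the HWP as an ideal reflection about its fast axis (global phase dropped), and the Brewster windows as an ideal x polarizer, all of which are assumptions. The angles α = 4° and β = 4.7° are illustrative values within the ranges quoted in the text.

```python
import numpy as np


def faraday_rotator(alpha):
    """Rotation of the polarization by alpha (same sense for both directions)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s], [s, c]])


def half_wave_plate(beta):
    """Ideal HWP with fast axis at angle beta: reflection about that axis."""
    c, s = np.cos(2 * beta), np.sin(2 * beta)
    return np.array([[c, s], [s, -c]])


def polarization_angle(jones_vector):
    """Angle of a real, linear polarization vector with respect to the x axis."""
    return np.arctan2(jones_vector[1], jones_vector[0])


def polarizer_loss(jones_vector):
    """Fraction of power rejected by an ideal x polarizer."""
    return jones_vector[1] ** 2 / np.dot(jones_vector, jones_vector)


x_pol = np.array([1.0, 0.0])
alpha, beta = np.radians(4.0), np.radians(4.7)

cw = half_wave_plate(beta) @ faraday_rotator(alpha) @ x_pol    # FR first, then HWP
ccw = faraday_rotator(alpha) @ half_wave_plate(beta) @ x_pol   # HWP first, then FR

print(np.degrees(polarization_angle(cw)), polarizer_loss(cw))    # angle 2*beta - alpha
print(np.degrees(polarization_angle(ccw)), polarizer_loss(ccw))  # angle 2*beta + alpha
```

In this idealized picture the rejected power fractions are sin²(2β − α) and sin²(2β + α), so tuning β at fixed α adjusts the loss contrast between the two directions.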

To experimentally demonstrate this enhanced Sagnac sensitivity, 
we use a custom-made, educational-grade He-Ne RLG (purchased 
from Luhs; https://luhs.de/Im-0600-hene-laser-gyroscope.html). The 
triangular cavity has a length of 138 cm and supports a free spectral 
range of about 216 MHz at 632.8 nm. The maximum loss that can be 
afforded in this system is approximately 3.6%. This resonator is then 
retrofitted with a terbium gallium garnet (TGG) Faraday element that 
can provide up to about 4° rotation at a magnetic induction of about 
80 mT. This is used in conjunction with a HWP with a rotation angle 
that can vary in a controlled manner with a resolution of 0.005°. An 
etalon in the cavity promotes lasing in a specific longitudinal mode 
while providing some level of coupling between the CW and CCW 
modes. Other elements, such as the TGG, also contribute to this coupling. 
Overall, the system is designed to allow maximum tunability 
in establishing an EP. 

Figure 2b, c provides a comparison between the principles of operation 
of a standard RLG and the EP-based RLG arrangement used in 
this study. In the former configuration, the Sagnac effect produces a 
shift ($\pm\Delta\omega_s/2$) in the lasing CW and CCW angular frequencies (which 
at rest coincide at $\omega_0$), where the beating frequency $\Delta\omega_s/(2\pi) = 4A\Omega/(\lambda_0 L)$ 
depends on the angular velocity Ω of the rotating frame, the area 
A enclosed by the light path (of perimeter L) and on the emission wavelength, 
$\lambda_0 = 2\pi c/\omega_0$ (c, speed of light in vacuum). Evidently, the beating 
frequency $\Delta\omega_s/(2\pi)$ in this Hermitian setup (which is electronically 
detected) is always proportional to Ω and is dictated by geometrical 
constraints (Fig. 2b). The situation is entirely different for the non-Hermitian 
configuration, where the carrier angular frequency $\omega_0$ can 
split by $\pm\Delta\omega_c/2$ even in the absence of rotation because of coupling 
effects arising from the scatterer (Fig. 2c). In this same static frame, 
by adjusting the differential loss Δγ, these two resonances can fuse 


perturbations. b, The real part of the eigenfrequency surface corresponding to 
a non-Hermitian configuration in the same parameter space. In the presence of 
an EP, the two Riemann manifolds are strongly intertwined, leading to a square-root 
response to perturbations, as indicated by the frequency of the emitted 
signal. Using this system, an enhanced sensitivity to small changes is expected. 


with each other once again at about $\omega_0$, thus marking the presence of 
an EP. After the system is set at an EP, upon rotation Ω, the Sagnac 
shifts $\pm\Delta\omega_s/2$ induce two new angular frequency lines at $\omega_0 \pm \Delta\omega_{EP}/2$ 
(Fig. 2c). In this case, the beating frequency $\Delta\omega_{EP}/(2\pi)$ is no longer 
proportional to Ω, but instead varies in an enhanced fashion because 
in this regime $\Delta\omega_{EP} \propto \sqrt{\Omega}$, as expected when operating in the vicinity 
of an EP (Fig. 2c). 

The frequency eigenvalues of the non-Hermitian RLG can be directly 
obtained from the transfer matrix T, after imposing periodic boundary 
conditions. From this, the induced non-Hermitian splitting $\Delta\omega_{EP}$ can 
be obtained, which interestingly enough remains unaffected even in 
the presence of gain saturation (see Supplementary Information). On 
the basis of these results, under rest conditions, one can compute the 
frequency split associated with the CW and CCW counter-propagating 
modes in our system as a function of the HWP angle when, for example, 
the coupling strength is set to κ = 400 kHz (Fig. 3a). In this case, an 
EP appears at β = 4.7°. The corresponding magnitude of the complex 
eigenvalues $|\lambda_{1,2}|$ of the system is plotted in Fig. 3b. The frequency beating 
signals expected from the Hermitian (orange line) and the non-Hermitian 
(black line) configurations of the RLG are plotted in Fig. 3c 
as a function of Ω. In the non-Hermitian case, we assume that the system 
is positioned at an EP (κ = Δγ) when κ = 400 kHz. The EP enhancement 
of the Sagnac shift is evident in this figure. For these parameters, if, 
for example, the system rotates at Ω = 1° s⁻¹, the Sagnac signal from the 
unmodified version of this RLG (Hermitian) is approximately 7.325 kHz, 
whereas the signal from the retrofitted (EP-based) system is expected 
to be about 5.2 times larger. Finally, Fig. 3d shows the change of beat 
note as a function of gyration speed when the system deviates from 
the EP (by 0.05% to 0.1% of the coupling strength). Although ideally 
one must keep the system at the EP, for small deviations the resulting 
error appears to be negligible. 
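
The quoted Hermitian Sagnac signal can be reproduced from the cavity geometry with the standard relation Δν = 4AΩ/(λ₀L). The sketch assumes an ideal equilateral triangular path with the 138 cm perimeter and 632.8 nm wavelength given in the text; the function names are illustrative.

```python
import math


def equilateral_area(perimeter_m):
    """Area enclosed by an equilateral triangular light path."""
    side = perimeter_m / 3.0
    return math.sqrt(3.0) / 4.0 * side ** 2


def sagnac_beat_hz(area_m2, perimeter_m, wavelength_m, rotation_rad_s):
    """Standard RLG beat frequency 4*A*Omega/(lambda0*L) in Hz."""
    return 4.0 * area_m2 * rotation_rad_s / (wavelength_m * perimeter_m)


perimeter = 1.38                  # m, cavity length from the text
wavelength = 632.8e-9             # m, He-Ne emission wavelength
rotation = math.radians(1.0)      # 1 degree per second in rad/s

beat = sagnac_beat_hz(equilateral_area(perimeter), perimeter, wavelength, rotation)
print(beat)   # about 7.3 kHz, consistent with the 7.325 kHz quoted for the standard RLG
```

The result depends only on the ratio A/L and the wavelength, which is the geometrical constraint that the EP-based scheme is intended to circumvent.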

Figure 4a depicts experimental results obtained from our non-Hermitian 
RLG system when it was biased at an EP. In our experiments, 
before each set of measurements, the system was positioned 
at an EP by monitoring the beat note as a function of the HWP rotation 
angle (gain-loss contrast), that is, setting the beat frequency as close as 
possible to zero. To do so, the HWP rotation angle was adjusted using 
a motorized rotation stage while the other components in the system 
Fig. 2 | Principle of operation of an EP-based He-Ne RLG. a, The equilateral 
RLG cavity comprises three highly reflective mirrors ($M_1$, $M_2$, $M_3$), a He-Ne gas 
tube as the gain medium, and an etalon used to select the desired longitudinal 
mode(s). In contrast to a standard RLG, the EP-based cavity also includes a 
Faraday rotator (FR) and a half-wave plate (HWP). These two elements, in 
conjunction with the Brewster windows (BWs), introduce a differential loss 
between the clockwise (CW; A → C) and counter-clockwise (CCW; C → A) 
directions. b, In a standard RLG, the Sagnac effect induces a shift ($\pm\Delta\omega_s/2$) in the 
stationary lasing angular frequencies ($\omega_0$) associated with the CW and CCW 
modes. The resulting angular beating frequency $\Delta\omega_s$ is proportional to the 
angular velocity Ω of the rotating frame. c, In an EP-based arrangement, the CW 
and CCW modes are first coupled to each other owing to the presence of weak 
scattering in the system. Consequently, the stationary lasing angular 
frequency ($\omega_0$) splits according to the ensuing coupling strength ($\pm\Delta\omega_c/2$). On 
the other hand, the loss contrast (Δγ) induced by the simultaneous action of the 
FR, HWP and BWs brings the two split modes back to $\omega_0$, that is, to an EP. Once 
the RLG is biased at the EP, the gyration will lead to a beat frequency that is 
proportional to $\sqrt{\Omega}$. It is expected that for small rotation rates, the beat note of 
the EP-based RLG will be considerably enhanced in comparison to that of a 
standard arrangement. 


were left intact. The figure provides data corresponding to three different 
coupling strengths, along with data from the standard unmodified 
RLG arrangement. These results are plotted in a log-log scale as a 
function of the rotation rate Ω for κ = 65 kHz, 150 kHz, 425 kHz. Whereas 
the response of the standard configuration is linear with respect to Ω 
(slope of 1), the same is not true for its non-Hermitian embodiment. 
In the latter case, the response is found to vary as the square root of 
the rotation rate Ω, as is evident from the slope of the accompanying 
three curves, which is very close to 1/2, a clear indication that an EP 
is at play. Our experimental observations clearly show that the scale 
factor of the Sagnac effect is substantially boosted by exploiting the 
very properties of EPs. The resulting Sagnac enhancement factors (with 
respect to the standard arrangement) are plotted in Fig. 4b for the same 
three cases. For κ = 425 kHz, a sensitivity boost of more than an order 
Fig. 3 | Bifurcations of complex eigenfrequencies and sensitivity 
enhancement of EP-based RLG around an EP. a, Resonance frequencies 
associated with the two coupled modes of an EP-based RLG at rest versus the 
HWP angle when the coupling strength κ is set to 400 kHz. These plots are 
obtained from Jones matrix analysis after considering gain saturation effects 
(see Supplementary Information, equation (16)). The EP in this system occurs at 
an HWP angle of about 4.7°. b, Magnitude of the same eigenvalues as a function 
of HWP angle when the non-Hermitian RLG is stationary (see Supplementary 
Information, equation (3)). c, Beat frequency as a function of angular speed Ω 
(in log-log scale) for a standard RLG (orange) and a non-Hermitian RLG (black) 
(see Supplementary Information, equation (6)). The RLG is set exactly at the EP, 
where the loss contrast balances the coupling (Δγ = κ = 400 kHz). For the 
standard RLG, the slope of the curve is unity, whereas it is reduced to 1/2 for the 
non-Hermitian arrangement. d, Calculated beat frequency for the non-Hermitian 
RLG as a function of rotation rate Ω (see Supplementary 
Information, equation (6)) when the loss contrast does not exactly balance the 
coupling (the differential loss differs from the coupling by 0.05%-0.1%) and 
hence the system is not precisely located at the EP. Whereas at large rotation 
rates the slope is approximately 1/2, it deviates from this value at small angular 
velocities when Δγ ≠ κ. When Δγ > κ (above the EP) the beat frequency (blue 
lines) is below that of the ideal case (black line), which indicates a reduced 
sensitivity to the rotation Ω. On the other hand, when the system is biased 
below the EP (Δγ < κ), the beat frequency does not exhibit a strong dependence 
on the gyration speed. 


of magnitude is observed when 0 = 0.4° s 7. The reported minimum 
gyration speed, Q=0.1° s“, is imposed by the limited rotation capa- 
bility of the apparatus. The estimated rotation rate is obtained from 
the beat frequency by applying the transfer functions associated with 
the Hermitian and non-Hermitian arrangements. These transfer func- 
tions are illustrated in Fig. 5a, where it can be observed that for small 
angular velocities, not only the absolute value of the beat frequency 
(Δν_EP > Δν_S, where Δν_EP = Δω_EP/(2π) and Δν_S = Δω_S/(2π) are the rotation-induced
beat frequencies in the EP-based RLG and the standard RLG,
respectively) increases dramatically for the EP-based system, but also
an incremental step in the rotation rate is transferred to a much larger
difference in the beat frequency (|Δν_EP,2 − Δν_EP,1| > |Δν_S,2 − Δν_S,1|). As a
result, the resolution of the estimated rotation speed is potentially 
improved. Figure 5b, c displays the error bars on the estimated rotation 
rates, as obtained experimentally for the Hermitian and non-Hermitian 
arrangements, respectively. In a standard gyroscope, the relationship 
between the applied and estimated angular velocities is linear. On the
other hand, owing to the nonlinear transfer function associated with 
the non-Hermitian system, these two quantities are not on an equal 
footing anymore. In this respect, the errors on the estimated rotation rate
can be interpreted correctly only when the nonlinearity of the transfer
function is taken into account. Consequently, at higher rotation speeds,
the modified gyroscope displays larger error bars, as shown in Fig. 5c. 
Fig. 4 | Measured beat frequency and sensitivity enhancement factor versus
rotation rate. a, Beat frequency versus rotation rate Ω (in log-log scale) for a
standard RLG (orange data marks) and a non-Hermitian RLG at three different
coupling strengths, κ_1 = 65 kHz (green), κ_2 = 150 kHz (blue) and κ_3 = 425 kHz
(red). The dashed lines correspond to theoretical calculations in which the
non-Hermitian system is biased exactly at the EP (Δγ_i = κ_i). The solid lines represent
fitted data obtained when the system is slightly detuned from the EP
(see Supplementary Information, equation (11)) (here Δγ_1 = 1.0003κ_1,
Δγ_2 = 0.9992κ_2, Δγ_3 = 0.9999κ_3). The orange line has a slope of unity, indicating
that the Sagnac shift in the standard cavity varies linearly with Ω. By contrast,
the slope associated with the non-Hermitian curves is approximately ½,
indicating the presence of an EP. Moreover, whereas in the standard RLG the
lock-in effect limits gyration measurements below Ω = 0.7° s⁻¹ (shaded region),
the EP-based configuration is capable of detecting smaller rotation rates (only
limited by the resolution of the step motor, 0.1° s⁻¹). The error bars show one
standard deviation from the mean of the collected data. b, The sensitivity
enhancement, defined as the ratio of the non-Hermitian beat frequency to that
of a standard RLG, is obtained from the measured data for the aforementioned
three coupling strengths. For Ω < 0.7° s⁻¹, the sensitivity enhancement is
calculated using the anticipated value of the beat frequency from the standard
RLG, provided that lock-in does not occur. The solid lines (red, blue and green)
represent theoretical curves corresponding to the parameters used in our
experiments.


The advantage of the EP-based gyroscope becomes apparent at smaller 
velocities, where the error in the estimated rotation rates decreases 
rapidly. This is depicted in Fig. 5b, c, where a noise component of 3 kHz
has been added to the ideal system (shown as orange and red shaded 
regions). 
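
The way a nonlinear transfer function reshapes the error bars can be illustrated with a short numerical sketch. It assumes an idealized toy model that is not taken from the paper: a standard RLG with a linear response Δν_S = S·Ω and an EP-biased RLG with Δν_EP = √(κ·Δν_S). The scale factor `S_HZ_PER_DEG_S` and all function names are hypothetical; only κ = 425 kHz and the 3 kHz noise level are quoted from the text.

```python
import math

KAPPA_HZ = 425e3        # coupling strength quoted in the text
NOISE_HZ = 3e3          # beat-note noise quoted for Fig. 5b, c
S_HZ_PER_DEG_S = 1.0e3  # hypothetical Sagnac scale factor (assumption, not from the paper)


def beat_standard(omega):
    """Linear transfer function of a standard RLG (beat frequency in Hz)."""
    return S_HZ_PER_DEG_S * omega


def beat_ep(omega):
    """Assumed square-root transfer function at the EP: sqrt(kappa * beat_standard)."""
    return math.sqrt(KAPPA_HZ * beat_standard(omega))


def rate_error_standard(omega, noise=NOISE_HZ):
    """Rotation-rate error = noise / slope; the slope is constant for a linear response."""
    return noise / S_HZ_PER_DEG_S


def rate_error_ep(omega, noise=NOISE_HZ):
    """Same propagation, using the local slope d(beat_ep)/d(omega)."""
    slope = 0.5 * math.sqrt(KAPPA_HZ * S_HZ_PER_DEG_S / omega)
    return noise / slope


for omega in (0.1, 1.0, 10.0, 1000.0):  # rotation rates in deg/s
    gain = beat_ep(omega) / beat_standard(omega)
    print(f"{omega:7.1f} deg/s  enhancement {gain:8.1f}  "
          f"error standard {rate_error_standard(omega):.3g}  EP {rate_error_ep(omega):.3g}")
```

In this toy model the two errors coincide at Ω = κ/(4S). Below that rate the EP-based estimate is tighter, and above it the error grows as √Ω, which mirrors the qualitative trend described for Fig. 5b, c.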

Several factors must be considered when using non-Hermitian 
arrangements for sensing purposes. First and foremost is appreciat- 
ing the difference between sensitivity and detection limit”. In non- 
Hermitian settings, the sensitivity enhancement is a fundamental 
feature that is dictated by mathematical properties, governed by the 
perturbation expansion around an EP. The detection limit, on the other
hand, depends on the physical system and is primarily determined by 
the net gain (or loss), as well as the correlation between the laser noise 
associated with the two resonances”*””. In this regard, one in principle 
Fig. 5 | Transfer functions and estimated rotation rates. a, The transfer
function of the standard system (orange line), that is, its response to a beat
frequency, is compared to that of the EP-based RLG for κ = 65, 150, 425 kHz
(green, blue and red lines, respectively). For the non-Hermitian gyroscope, at
small rotation rates both the absolute values of the beat frequency (Δν_EP > Δν_S)
and of the beat frequency differences (|Δν_EP,2 − Δν_EP,1| > |Δν_S,2 − Δν_S,1|) for an
incremental step in the rotation rate increase dramatically. b, c, Predicted
rotation rate for the standard RLG (b) and for the modified non-Hermitian RLG
at κ = 425 kHz (c), obtained by applying the associated transfer functions to the
measured beat frequencies. The shaded areas demonstrate the effect of noise
(3 kHz) on the estimated rotation rates. The error bars show one standard
deviation from the mean of the collected data.


can increase the net gain while keeping the RLG at the EP by manag- 
ing the gain contrast to boost both the sensitivity and the detection 
limit—as we did in our design. As expected from the Schawlow-Townes 
formula, an increase in the net (average) gain of the system (or the 
output power) will reduce the linewidth of the laser. This in turn tends 
to compensate for the linewidth broadening near the EP while allow- 
ing one to exploit the larger sensitivity afforded by such singularities. 
Another technical issue is how closely one can reach and stabilize the 
system at the EP*®”*. In our experiment, we fully rely on positioning the 
RLG at the EP before each set of measurements, by visually monitoring 
the beat note as a function of the HWP rotation angle (gain-loss con- 
trast). In future devices to be used in the field, one may need to actively 
control the system to remain biased at the EP. Such approaches have 
been suggested elsewhere”””. 

In conclusion, we have demonstrated for the first time, to our knowledge,
a new class of non-Hermitian RLGs that can display an enhanced
Sagnac sensitivity. This is accomplished by exploiting the intriguing 
properties of a special family of non-Hermitian spectral singularities, 
the EPs. At these points, the RLG response has a square-root dependence
on the gyration speed, in contrast to the linear response observed
in standard arrangements. The proposed configuration may inspire new 
technological developments in various settings in which measuring low 
rotation rates via ultracompact systems is highly attractive. Finally, the 
idea of transforming a standard measuring apparatus into an EP-based 
device with superior sensitivity may have important ramifications in 
other areas of science and technology. 
Hydrodynamics, which generally describes the flow of a fluid, is expected to hold
even for fundamental particles such as electrons when inter-particle interactions
dominate. Although various aspects of electron hydrodynamics have been revealed
in recent experiments, the fundamental spatial structure of hydrodynamic
electrons—the Poiseuille flow profile—has remained elusive. Here we provide direct 
imaging of the Poiseuille flow of an electronic fluid, as well as a visualization of its 
evolution from ballistic flow. Using a scanning carbon nanotube single-electron 
transistor”, we image the Hall voltage of electronic flow through channels of high- 
mobility graphene. We find that the profile of the Hall field across the channel is a key 
physical quantity for distinguishing ballistic from hydrodynamic flow. We image the 
transition from flat, ballistic field profiles at low temperatures into parabolic field 
profiles at elevated temperatures, which is the hallmark of Poiseuille flow. The 
curvature of the imaged profiles is qualitatively reproduced by Boltzmann 
calculations, which allow us to create a ‘phase diagram’ that characterizes the 
electron flow regimes. Our results provide direct confirmation of Poiseuille flow 
in the solid state, and enable exploration of the rich physics of interacting electrons
in real space.


The notion of viscosity arises in hydrodynamics to describe the diffusion
of momentum in a fluid under the application of shear stress. When
scattering between constituent fluid particles becomes dominant, viscosity
manifests as an effective frictional force between fluid layers. The
hallmark of such hydrodynamic transport in a channel is a parabolic, or
Poiseuille, velocity flow profile, which typifies familiar phenomena like 
water flowing through a pipe. Electron flow has long been predicted’ 
to undergo hydrodynamic transport when the rate of momentum- 
conserving Coulomb scattering between electrons exceeds that of 
momentum-relaxing scattering from impurities, boundaries and pho- 
nons??"*, The implications of a dominant viscous force on electronic 
flow have been studied in a range of theoretical work. While initial
efforts were based on linearized Navier-Stokes equations, describing 
electron hydrodynamics in the context of diffusive transport*!8°?, 
there is an evolving understanding that a central part of the physics is 
the emergence of hydrodynamics from ballistic flow?®”??*8, Reaching 
the hydrodynamic regime experimentally requires materials of high 
purity so that Ohmic transport can be minimized, which is now possible
in a growing number of high-mobility systems. Indeed, recent experiments
have demonstrated the existence of negative non-local resistance,
superballistic flow, signatures of Hall viscosity, breakdown
of the Wiedemann-Franz law, and anomalous scaling of resistance
with channel width, all phenomena associated with hydrodynamic
electron flow. Yet the direct observation of the fundamental Poiseuille 
flow profile has remained elusive. 

In this work, we provide the first, to our knowledge, spatial imag- 
ing of Poiseuille flow of hydrodynamic electrons, as well as the evolu- 
tion from ballistic to hydrodynamic flow. We use a scanning carbon 
nanotube single-electron transistor (SET) to non-invasively image 
maps of the longitudinal and Hall voltage of electrons flowing through 
high-mobility graphene/hexagonal boron nitride (hBN) channels”. 
By varying the carrier density (degenerate regime away from charge 
neutrality) and temperature, we tune the two relevant length scales 
controlling electron flow: the momentum-relaxing mean free path, set 
by electron-impurity and electron-phonon scattering, and the
momentum-conserving mean free path, set by electron-electron interactions.
We find that the spatial profile of the Hall field across the channel is 
key to distinguishing the evolution from ballistic into hydrodynamic 
flow. At low temperatures, we observe flat profiles associated with 
ballistic flow. At higher temperatures the profiles become parabolic, 
with curvature approaching that of ideal Poiseuille flow. Overall, we 
find that Boltzmann kinetic equations qualitatively reproduce our 
observations, although at the highest temperatures they underesti- 
mate the curvature of the Hall field profiles. Finally, we show that this 
Fig. 1 | Overview of graphene channel device and imaging of
magnetoresistance. a, Optical image of graphene channel device used for
imaging electron flow, consisting of a high-mobility monolayer of graphene
sandwiched between hBN layers (purple) and electrical contact electrodes
(yellow) on top of the conducting Si/SiO₂ back gate. The dark lines are etched
walls that define a channel of width W = 4.7 μm and length L = 15 μm (outlined
with dashed box; scale bar, 2.5 μm). b, Rendering of scanning SET imaging
performed in experiments. The nanotube-based SET is positioned at the
end of a scanning probe cantilever, and is rastered across the channel
(graphene in purple, sandwiched between hBN layers atop a Si/SiO₂ substrate
in blue) to locally image the potential generated by the electrical current I in
a perpendicular magnetic field B. c, Magnetoresistance of graphene channel at
a temperature of T = 7.5 K, antisymmetrized in B, imaged non-invasively with


curvature is the distinctive metric for characterizing the different flow 
regimes, allowing us to construct a phase diagram and map the regions 
explored by the experiment. 

The devices we studied are high-mobility monolayer graphene/hBN 
heterostructures patterned into channels of various lengths, L, and 
widths, W. Below we present data from a device with W = 4.7 μm and
L = 15 μm (Fig. 1a), but similar results have been obtained for a device
with a different width, aspect ratio, and etched boundaries (see Methods
and Extended Data Fig. 5).

We first perform the scanning analogue of transport measurements
of longitudinal resistivity, ρ_xx. Flowing current I through the channel
and imaging the potential produced by the flowing electrons, φ(x),
along the centreline (dashed line in Fig. 1b) yields ρ_xx = (W/I)(dφ/dx). Figure 1c
shows ρ_xx as a function of perpendicular magnetic field, B, for various
carrier densities, n, at a temperature of T = 7.5 K. Notably, with increasing
|n|, ρ_xx evolves from a single- to double-peaked structure. This is a
well known signature of ballistic electron transport (l_MR > W, where l_MR
is the momentum-relaxing mean free path), when scattering at the
walls is diffusive (see Methods and Extended Data Fig. 2). The B
dependence of ρ_xx is set by the ratio of W and the cyclotron radius,
R_c = ħ√(π|n|)/(eB) (ħ is the reduced Planck constant and e is the electron
charge). For |W/R_c| > 2, backscattering is strongly suppressed, and Boltzmann
theory predicts that ρ_xx is determined primarily by bulk scattering
(with correction proportional to |W/R_c|⁻¹), allowing us to estimate
l_MR (see Methods and Extended Data Fig. 1). Figure 1d plots the extracted
l_MR as a function of n at several different temperatures. For T = 7.5 K, l_MR
exhibits the expected |n|-dependence, while at T = 75 K and 150 K, l_MR
displays a characteristic flat density dependence due to the addition
of phonon scattering.

We now turn to the Hall voltage profiles, which are fundamentally 
related to the current flow profiles of electrons in the channel. We 
restrict all subsequent analysis to the bulk of the channel (|y/W| < 0.3), 
in which the convolution of the potential jump near the channel edge 
with the point spread function of our SET due to the imaging height 
(h = 880 nm) is negligible (Fig. 2a, b). In the Ohmic regime (l_MR ≪ W),
there is a local relation between the y-component of the Hall field, E_y =
dV_Hall/dy (where V_Hall is the local Hall voltage), and the current density
parallel to the channel axis, j_x, given by E_y = (B/ne)j_x. In the hydrodynamic
regime, where l_ee < W (where l_ee is the electron-electron scattering
scanning SET. The SET is scanned along the centreline of the channel (black
dashed line in b) to image the potential drop Δφ in order to extract the
longitudinal resistance ρ_xx = (W/I)(Δφ/Δx) as a function of magnetic field B for
different charge carrier densities n (black curve is low density, high density in
green; numbers label the density of each curve). Inset, the same ρ_xx data plotted
as a function of W/R_c, which is proportional to B (see text). At high density, the
magnetoresistance curves show a double-peaked structure, indicating ballistic
transport with diffusive walls (see Methods). d, Momentum-relaxing mean free
path l_MR in the bulk of the graphene channel as a function of carrier density for
several temperatures. The SET is maintained at liquid helium temperature
throughout all measurements. The value of l_MR is deduced from ρ_xx(B) and is
described in Methods, which also presents the associated mobility (Extended
Data Fig. 1).


length), the current density is predicted to be parabolic, leading to an
analogous relation (see Methods):

$$E_y = \frac{B}{ne}\left[j_x + \frac{1}{2}\,l_{ee}^{2}\,\partial_y^{2} j_x\right] \qquad (1)$$


Deep in the hydrodynamic regime, where l_ee/W < 1, the local relation
between E_y and j_x is recovered to a good approximation. Imaging E_y(y)
in these regimes therefore effectively images the current distribution,
j_x(y). In the ballistic regime, this relation breaks down, leading to a
fundamentally different E_y profile. As we show, E_y is then a key observable
for distinguishing between ballistic and hydrodynamic flows. Figure 2c
shows the potential along y measured at small magnetic fields
B = ±12.5 mT, antisymmetrized in B, to yield the Hall voltage profile
V_Hall(y) = ½[φ(y, B) − φ(y, −B)], where T = 7.5 K and n = −1.5 × 10¹¹ cm⁻².
Note that B is small enough that the flow remains semiclassical (Landau
level filling factor ν > 100 and ħω_c ≪ k_BT, where ω_c is the cyclotron
frequency and k_B is the Boltzmann constant). The E_y(y) profiles
are obtained by numerically differentiating the imaged V_Hall(y)
profiles.
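
The two processing steps described above (antisymmetrizing in B, then differentiating along y) can be sketched as follows. This is a minimal illustration on synthetic data rather than the authors' analysis code; the function names and the synthetic potentials are assumptions, and only the channel width and the bulk window |y/W| < 0.3 are taken from the text.

```python
import numpy as np


def hall_voltage(phi_plus, phi_minus):
    """Antisymmetrize potentials imaged at +B and -B: V_Hall = (phi(+B) - phi(-B)) / 2."""
    return 0.5 * (np.asarray(phi_plus) - np.asarray(phi_minus))


def hall_field(y, v_hall):
    """Numerically differentiate V_Hall(y) to obtain E_y(y) = dV_Hall/dy."""
    return np.gradient(v_hall, y)


# Synthetic example (not measured data): a B-even background plus a B-odd Hall term.
W = 4.7e-6                               # channel width from the text, in metres
y = np.linspace(-0.3 * W, 0.3 * W, 61)   # bulk region |y/W| < 0.3
background = 1e-4 * np.cos(y / W)        # even in B, cancels on antisymmetrizing
E0 = 91.0                                # V/m, used here only as a convenient scale
hall_term = E0 * y                       # odd in B, corresponds to a uniform Hall field

phi_plus = background + hall_term        # potential at +B
phi_minus = background - hall_term       # potential at -B

E_y = hall_field(y, hall_voltage(phi_plus, phi_minus))
print(E_y.min(), E_y.max())              # both equal E0 up to rounding
```

Because the background is identical at both field polarities, it drops out exactly, and the finite-difference derivative of the remaining linear Hall voltage returns the uniform field.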

We now observe how electron-electron interactions affect the Hall
field profiles by comparing imaging at different temperatures: T = 7.5 K
in Fig. 2e, and T = 75 K in Fig. 2f. While increased temperature should
increase the electron-electron scattering rate (decrease l_ee), it also
increases electron-phonon scattering (decreases the electron-phonon
mean free path, l_ph) and correspondingly reduces l_MR = (l_imp⁻¹ + l_ph⁻¹)⁻¹,
where l_imp is the impurity scattering mean free path. To best isolate the
influence of l_ee, we therefore maintain a nearly constant l_MR across the
different temperatures by tuning the carrier density between the measurements
(circles in Fig. 2d; see legend for details). Notably, the imaged
profile at T = 7.5 K is flat across the bulk of the channel (Fig. 2e). In contrast,
the profile at T = 75 K is strongly parabolic (Fig. 2f). The dramatic
difference in curvature between these profiles becomes more apparent
when we image the full two-dimensional maps of the Hall field (within
the black box in Fig. 2a), demonstrating that the profiles are independent
of position along the channel (Fig. 2g, h). All measurements are
performed at small enough magnetic field (W/R_c = 1.3) to minimally
influence the profiles, as well as low voltage bias across the channel to
avoid electron heating (see Methods and Extended Data Figs. 3, 4).
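
As a consistency check on the quoted ratio W/R_c = 1.3, the cyclotron radius can be evaluated for the field and density pairs given for the two temperatures. The sketch assumes the standard graphene expression R_c = ħ√(π|n|)/(e|B|) and densities of order 10¹¹ cm⁻²; the function name is illustrative.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
W = 4.7e-6                  # channel width from the text, m


def cyclotron_radius(n_cm2, b_tesla):
    """R_c = hbar * sqrt(pi * |n|) / (e * |B|); n in cm^-2, result in metres."""
    n_m2 = abs(n_cm2) * 1e4  # convert cm^-2 to m^-2
    return HBAR * math.sqrt(math.pi * n_m2) / (E_CHARGE * abs(b_tesla))


# (density in cm^-2, field in T) for the T = 7.5 K and T = 75 K profiles
settings = [(-1.5e11, 12.5e-3), (-3.1e11, 18.0e-3)]
ratios = [W / cyclotron_radius(n, b) for n, b in settings]
for (n, b), ratio in zip(settings, ratios):
    print(f"n = {n:.1e} cm^-2, B = {b * 1e3:.1f} mT -> W/R_c = {ratio:.2f}")
```

Both settings give a ratio close to 1.3, which suggests the field was scaled with √|n| so that the profiles at different densities are compared at the same W/R_c.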
Fig. 2 | Imaging ballistic and Poiseuille electron flow profiles. a, Graphene
channel with overlay indicating the region over which flow profiles are imaged.
One-dimensional profiles are taken along the dashed line and two-dimensional
profiles are imaged across the region enclosed by the black box (scale bar,
2.5 μm). b, Potential of flowing electrons, φ, as a function of the y coordinate
(dashed line in a) imaged at B = 0 (blue curve, T = 7.5 K). The dashed yellow curve
is a boxcar function convolved with the point spread function of our SET
measurement, determined primarily by the height of our SET detector above
the graphene during the scan. Grey-shaded regions (0.3 < |y/W| < 0.5) indicate
where the smearing of the steps at the edges due to the finite spatial resolution
has a non-negligible contribution. c, Imaged Hall voltage, V_Hall, from
antisymmetrizing measurements taken at field B = ±12.5 mT, n = −1.5 × 10¹¹ cm⁻²
and T = 7.5 K. Normalization /,,R, = 470 μV. d, l_MR from Fig. 1d, but now
normalized by W. Dots indicate the carrier densities of the profile imaging in all
subsequent panels, where n = −1.5 × 10¹¹ cm⁻² at 7.5 K and n = −3.1 × 10¹¹ cm⁻² at


One naively expects the current density profile, j_x(y), to be flat for
ballistic flow and parabolic for hydrodynamic Poiseuille flow. However,
a full Boltzmann theoretical calculation of the profiles of j_x and
E_y including l_MR (Fig. 2i, j and Methods) reveals that this is not the case.
The j_x profile, even deep in the ballistic regime (l_MR/W > 1), is not flat
(see Methods and Extended Data Fig. 8). Figure 2i plots the j_x profile
calculated for l_MR/W = 2 and l_ee/W = 4.3, consistent with our measurements
at T = 7.5 K, showing that j_x has large curvature. In fact, the Boltzmann
theory predicts a strongly curved j_x profile even for much larger
l_MR/W, showing that the ballistic j_x profile is not qualitatively different
from its hydrodynamic counterpart (an example calculated for
l_MR/W = 1.4 and l_ee/W = 0.16 is shown in Fig. 2j), and is therefore a weak
marker for the emergence of electron hydrodynamics. In contrast, the
Boltzmann theory shows that the E_y profile differs markedly between
ballistic and hydrodynamic flows, making it a way of distinguishing
these regimes. In the ballistic regime E_y is flat (Fig. 2i), and can even
75 K, chosen such that l_MR is nearly equal for both temperatures. e, The Hall
field, E_y, at T = 7.5 K, from measurements at B = ±12.5 mT, obtained by numerical
differentiation of V_Hall with respect to y, normalized by the classical value
E_cl = (B/ne)I/W = 91 V m⁻¹. f, E_y at T = 75 K, from measurements at B = ±18.0 mT, with
E_cl = 162 V m⁻¹. The right y axis converts the field to units of current density by
scaling with ne/B. g, Two-dimensional map of E_y taken over the boxed region in a
at T = 7.5 K. h, Two-dimensional map of E_y at T = 75 K. i, j, Calculation of the
current density j_x (normalized by I/W = 2 A m⁻¹ in i and 5.4 A m⁻¹ in j), and the
Hall field E_y/E_cl, based on the Boltzmann theory with values of l_MR and l_ee
corresponding to the experimental data in e and f. In i, the values used are
l_MR/W = 2 and l_ee/W = 4.3, whereas for j, l_MR/W = 1.4 and l_ee/W = 0.16. The
calculated profiles are convolved with the point spread function of the SET for
direct comparison with the experiment. The current density appears parabolic
in both the hydrodynamic and ballistic regimes, whereas the E_y profile is
relatively flat in the ballistic regime and parabolic in the hydrodynamic regime.


become negatively curved if l_MR/W is increased further, while in the
Poiseuille regime E_y is positively curved (Fig. 2j).

The E_y profile in Fig. 2j is calculated to best fit our measurements
at T = 75 K (Fig. 2f) with a Knudsen number of Kn = l_ee/W = 0.16. This is
consistent with hydrodynamic electron flow in which l_ee is the smallest
length scale in the system, in agreement with previous transport
measurements. The j_x and E_y profiles calculated for these parameters
(Fig. 2j) are similarly curved (deviation scales as (l_ee/W)², consistent
with equation (1)), showing that the imaged E_y profile (Fig. 2f) approximates
the actual Poiseuille j_x profile to within 5% (see the right y axis).
The theoretical j_x profile corresponding to the T = 75 K measurement
does not reach zero at the walls. Extrapolating this profile to zero yields
an estimated slip length of l_slip ≈ 500 nm.

Having imaged the emergence of Poiseuille flow at increased temperatures,
we now explore its carrier density dependence. For a linearly
dispersing spectrum, Fermi liquid theory predicts l_ee ∝ E_F/T² ∝ √|n|/T²
Fig. 3 | Carrier density dependence of hydrodynamic electron flow profiles.
a, l_MR/W for T = 75 K taken from Fig. 1d, with dots indicating values of n
corresponding to experiments in subsequent panels. Between the green dots,
l_MR is practically independent of n owing to the combination of phonon
and impurity scattering. b, Comparison of magnetoresistance in units of
the inverse transport mean free path W/l_tr, where for Dirac electrons
l_tr(B) = h/[2e²(π|n|)^(1/2) ρ_xx(B)] (where h is Planck's constant), at T = 75 K for several

(where E_F is the Fermi energy), and so a variation of the flow profiles with
n is expected. Varying n, however, will generically also change l_MR,
possibly masking the relatively weak √|n|-dependence of l_ee.
Fortunately, at elevated temperatures there is a range of n over which
l_MR remains nearly constant owing to the compensating effects of phonon
and impurity scattering (between green dots in Fig. 3a, T = 75 K). In fact,
the magnetoresistance at two substantially different densities (Fig. 3a,
green dots) is nearly identical (Fig. 3b, green curves), implying that from
transport measurements alone it is impossible to distinguish between
electron flows at these densities (see Methods and Extended Data Fig. 6).
However, the corresponding imaged E_y profiles (Fig. 3c, green curves)
Fig. 4 | Curvature of the imaged E_y profiles and phase diagram of electron
flow regimes. a, Normalized curvature, κ, of the imaged E_y profiles as a
function of n and T as described in the main text (data points marked by
crosses). Dashed red lines mark the maximal curvature obtained for
non-interacting electrons based on Boltzmann calculations, and also the curvature
of the ideal Poiseuille flow with zero slip length. Inset, l_ee at the values of n and T
from the experiment (solid lines), determined by comparing the imaged E_y
profiles to those calculated using the Boltzmann equations (error bars
correspond to the standard deviation of l_ee, computed by least-squares fitting
of Boltzmann calculations to experimental data). The coloured dashed lines
are the corresponding predictions for l_ee based on many-body calculations for
monolayer graphene. The dashed black line marks the length of the device L
(normalized by W), above which the Boltzmann theory for an infinitely long
channel can no longer predict l_ee. b, Phase diagram of electron flow as obtained
from κ, calculated by Boltzmann theory (colour scale) as a function of l_MR/W and
l_ee/W. The curvature values are determined by convolving the calculated
profiles with the point spread function of the experiment at the same finite
magnetic fields as in the experiment (W/R_c = 1.3) for best comparison. The
different electron flow regimes are labelled (ballistic, Ohmic, Poiseuille and

78 | Nature | Vol576 | 5 December 2019 




values of n indicated by the colour of the curve (corresponding to dots in a).
The two green curves at higher |n| exhibit nearly indistinguishable
magnetotransport. c, E_y/E_cl profiles imaged for the same values of n as in b, as
indicated by colour (W/R_c = 1.3 for each curve, with B = ±25.4 mT, 18.5 mT and
12.5 mT for n = −6.2 × 10¹¹ cm⁻², −3.3 × 10¹¹ cm⁻² and −1.5 × 10¹¹ cm⁻², respectively),
demonstrating the monotonic increase of curvature with decreasing |n|. E_cl is
the bulk value for the classical Hall field, (B/ne)I/W.


are markedly different, varying in curvature by about 50%, which reflects
the variation in l_ee. This result again highlights that E_y is a sensitive indicator
for hydrodynamics. At even lower |n| (black dot, Fig. 3a) l_MR drops and
both magnetoresistance (Fig. 3b, black curve) and the E_y profile (Fig. 3c,
black curve) change as compared to higher densities.

We now systematically investigate how the curvature of E_y varies over
a broader range of n and T. For each n and T we image the E_y profile, fit
it to the form E_y(y) = ay² + c for |y/W| < 0.3, and extract the normalized
curvature κ = −(a/c)(W/2)² (κ = 0 for a flat profile and κ = 1 for an ideal
parabolic Poiseuille profile, reaching zero at the walls). Figure 4a plots
the measured κ as a function of n for T = 7.5 K, T = 75 K and T = 150 K.
porous) together with illustrations of the relevant scattering mechanisms.
Electrons are drawn as green circles, and E_y profiles are schematically drawn in
purple. In the ballistic regime, the E_y profile is flat or even negatively curved
(the magnitude of negative curvature is limited by the nonzero magnetic field).
In the Ohmic regime, electrons scatter primarily from impurities/phonons
(drawn as crosses), and the E_y profile can be gently curved. In the Poiseuille
regime, electrons primarily scatter from other electrons, leading to a strongly
parabolic E_y profile. In the porous regime, both electron scattering from
impurities and phonons and electron-electron scattering have a prominent
role, resulting in an E_y profile that is gently curved in the middle of the channel
and reaches zero over a distance of the order of D_ν = ½√(l_MR l_ee) from the walls.
The green lines mark the transitions between the different regimes: ballistic to
Ohmic at l_MR/W = 1, transition to hydrodynamics at l_ee/W = 1, and transition from
Poiseuille to porous at D_ν/W ~ 1. In the Poiseuille regime the profiles can reach a
maximum curvature of κ = 1. The overlaid blue, purple and red paths
correspond to the values of l_MR and l_ee (same error bars as in the inset of a) at
T = 7.5 K, T = 75 K and T = 150 K, respectively, from the experimental traces in a,
with the dots indicating the lowest density.


At T = 7.5 K we find that κ is close to zero, and even becomes negative
at high density. We further observe that the value of κ monotonically
increases with increasing T and decreasing |n|, with the measured
curvature approaching the ideal Poiseuille value at the highest T and
lowest |n|.
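
The curvature extraction described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the function name is hypothetical, the fit uses only the even terms y² and a constant, and the two test profiles are synthetic limiting cases (flat, and an ideal parabola reaching zero at the walls). The fit is done in the scaled coordinate u = y/W purely for numerical conditioning.

```python
import numpy as np


def normalized_curvature(y, e_y, width):
    """Fit E_y(y) = a*y**2 + c over |y/W| < 0.3 and return kappa = -(a/c)*(W/2)**2."""
    y = np.asarray(y, dtype=float)
    e_y = np.asarray(e_y, dtype=float)
    bulk = np.abs(y / width) < 0.3            # restrict the fit to the channel bulk
    u = y[bulk] / width                       # scaled coordinate for a well-conditioned fit
    design = np.column_stack([u ** 2, np.ones_like(u)])
    coeffs = np.linalg.lstsq(design, e_y[bulk], rcond=None)[0]
    a_scaled, c = coeffs                      # a_scaled = a * W**2
    return -(a_scaled / c) / 4.0              # equals -(a/c) * (W/2)**2


W = 4.7e-6                                    # channel width from the text, in metres
y = np.linspace(-W / 2, W / 2, 201)
flat_profile = np.full_like(y, 91.0)                      # ballistic-like limit
poiseuille_profile = 91.0 * (1.0 - (2.0 * y / W) ** 2)    # ideal parabola, zero at the walls

kappa_flat = normalized_curvature(y, flat_profile, W)
kappa_poiseuille = normalized_curvature(y, poiseuille_profile, W)
print(kappa_flat, kappa_poiseuille)
```

The two limiting profiles recover the normalization stated in the text: κ near 0 for the flat profile and κ near 1 for the ideal Poiseuille parabola.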

To demonstrate the relation between the curvature of the E_y profiles
and the flow regime more quantitatively, we plot in Fig. 4b a phase diagram
of the flow based on κ calculated using the Boltzmann theory as
a function of the two length scales that control the physics: l_MR/W and
l_ee/W. The phase space is demarcated into four regions: Ohmic, ballistic,
Poiseuille and porous, the last two of which are hydrodynamic. In the
Ohmic regime the curvature is small, peaking when l_MR/W = 0.25. In the
ballistic regime, where l_MR/W > 1, κ is governed by the reciprocal sum
(l_MR⁻¹ + l_ee⁻¹)⁻¹, and even becomes negative (see Fig. 4a at T = 7.5 K). In the
left half of the phase diagram (l_ee/W < 1), the flow is hydrodynamic, and
is either Poiseuille (top left) or porous (bottom left) in character. The
transition occurs when the so-called 'Gurzhi' length, D_ν = ½√(l_MR l_ee),
crosses W. In the porous regime (D_ν < W), named in analogy to water flow
through porous media, both l_MR and l_ee can be smaller than W. Here, κ is
low as in the Ohmic regime, but electron-electron interactions cause
a sharp drop of E_y at the walls. In the Poiseuille regime (D_ν > W), κ increases
substantially, approaching κ = 1, with the parabolic profiles of both E_y
and j_x reaching zero at the walls (see Methods and Extended Data Fig. 8).

We now quantitatively compare the imaged E_y profiles at each density
and temperature against the Boltzmann theory. Using the l_MR presented
in Fig. 1d, we fit the entire Boltzmann profiles to our imaged profiles
to determine the l_ee that gives the best match. The extracted values of
l_ee (solid lines in the inset of Fig. 4a) are in close agreement with the
many-body calculation for monolayer graphene (see dashed lines
in the inset of Fig. 4a), exhibiting the predicted decrease of l_ee with
decreasing |n| and increasing T. Note that once l_ee exceeds the length
of the channel (dashed black line) the Boltzmann calculations, which
assume an infinite channel, lose their predictive power. Also, although
at T = 7.5 K and T = 75 K the Boltzmann profiles closely match the overall
magnitude and curvature of the imaged E_y profiles, at T = 150 K, the
best-fit profiles underestimate the imaged curvature (see Methods and
Extended Data Fig. 9). This is probably caused by the scattering time
approximation used in the calculation, suggesting that an improved
microscopic understanding of electron-electron interactions is necessary
to more completely understand hydrodynamics in real electronic
systems (for example, using scattering integrals that better account
for energy-momentum conservation in two dimensions, such as the
long-lived odd-parity Fermi surface excitation modes proposed in
refs. ***5). Finally, we overlay the values of l_MR and l_ee obtained from
the measurements onto Fig. 4b (coloured paths correspond to the
different temperatures, dots indicate lowest densities), showing the
trajectories through the phase diagram explored in the experiment.
Probing deeper into the Ohmic regime is limited, as further decreasing
l_MR requires low carrier densities where inhomogeneity near the
channel edges becomes important (Δn_edges ≈ 10¹⁰ cm⁻²). Reaching
deeper into the Poiseuille regime is also problematic, as the necessary
higher temperature induces increased phonon scattering, resulting
in D_ν < W.

In conclusion, we have imaged electron flow through graphene channel
devices by mapping the transverse component of the Hall electric
field, which we find to be the essential element for distinguishing
hydrodynamic from ballistic flow. With increasing temperature, we
observe the evolution from flat ballistic profiles to curved profiles,
producing images of Poiseuille electronic flow. Taken together with
previous studies, our experiments firmly establish the existence of
an electron liquid that flows according to a universal hydrodynamic
description. Our approach should enable further exploration of the
physics of strongly interacting electrons upon application to other
materials and topologically distinct flow geometries.


Methods


Device fabrication 

Scanning SET devices are fabricated using a nanoscale assembly technique.
The graphene/hBN devices are fabricated using electron-beam
lithography and standard etching and nanofabrication procedures to
define the channels and evaporation of Pt (see main text) and Pd/Au
(Extended Data Fig. 5) to deposit contact electrodes.


Measurements 

The measurements are performed on multiple graphene devices in two 
separate, home-built, variable-temperature, Attocube-based scanning 
probe microscopes. The microscopes operate in vacuum inside liquid 
helium dewars with superconducting magnets, and are mechanically 
stabilized using Newport laminar flow isolators. A local resistive sur- 
face mount device heater is used to heat the samples under study from 
T=7.5 K to T=150 K, and a DT-670-BR bare chip diode thermometer 
mounted proximally to the samples and on the same printed circuit 
boards is used for precise temperature control. The voltage imaging 
technique employed is presented in ref. ”. Voltages and currents (for 
both the SET and sample under study) are sourced using a home-built
digital-to-analog converter array, and measured using a home-built,
software-based audio-frequency lock-in amplifier consisting of 1 µV
accurate d.c. and a.c. sources and a Femto DLPCA-200 current amplifier
and NI-9239 analog-to-digital converter. The local gate voltage
of the SET is dynamically adjusted via custom feedback electronics 
employing a least-squares regression algorithm to prevent disrup- 
tion of the SET’s working point during scanning and ensure reliable 
measurements. 

The voltage excitations applied to the graphene channels are as follows:
<4.3 mV at T = 7.5 K, <7.5 mV at T = 75 K, and <15 mV at T = 150 K, all
chosen to not cause additional current heating (Extended Data Fig. 4).
The magnetic fields applied are in the range ±100 mT.


Determination of the momentum-relaxing mean free path 

For a channel geometry of width W, as used in the experiments in this
paper, the longitudinal resistivity, ρ_xx, reflects both the bulk resistivity
of the graphene as well as scattering from the walls. To isolate the
contribution from the bulk resistivity and determine the momentum-relaxing
mean free path in the bulk, l_MR, we make use of the measured
magnetoresistance. At any magnetic field we can obtain the transport
mean free path from the measured ρ_xx via the Drude formula for Dirac
electrons, l_tr(B) = h/[2e²(π|n|)^(1/2) ρ_xx(B)]. In the semiclassical regime, the
primary influence of a perpendicular magnetic field B is to bend the
electron trajectories into cyclotron orbits of radius
R_c = ħ(π|n|)^(1/2)/(e|B|). At small
magnetic fields such that the skipping orbit diameter is larger than the
channel width, |W/R_c| < 2, electrons can be efficiently backscattered in
the bulk and by the walls, and thus l_tr(B) contains the effects of both
bulk and wall scattering. On the other hand, when |W/R_c| > 2, the
backscattering from the walls is highly suppressed because a cyclotron orbit
emerging from one wall cannot reach the other wall without scattering
at least once in the bulk. In this regime the transport mean free path is
primarily controlled by the bulk scattering length, l_tr = l_MR, with a small
correction scaling as |W/R_c|^-1 due to the volume participation ratio of
skipping cyclotron orbits. In fact, using Boltzmann calculations of the
magnetoresistance we can determine the correction factor over the
entire phase space of the experiment. Extended Data Fig. 1 shows the
ratio, l_tr/l_MR, calculated using Boltzmann theory (section ‘Boltzmann
simulations of flow profiles’ in Methods), as a function of l_ee/W and
l_MR/W for W/R_c = 3.2. By estimating the l_ee in our experiments using the
E_y measurements and the Boltzmann calculations as in the main text
(inset of Fig. 4a), and using l_tr as a zeroth-order estimate for l_MR, we can
determine from Extended Data Fig. 1 the correction factor and obtain
from our measured l_tr the bulk l_MR. Note that in the regions of the phase
diagram traversed by the experiment (curves in Fig. 4b), the correction
factor is rather small and the maximum deviation of l_MR from l_tr is about
30%, so even the naive estimate, l_MR = l_tr, is already quite accurate.
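
As a hedged illustration of the two conversions used above, the following minimal sketch evaluates the Drude formula for Dirac electrons and the cyclotron radius in SI units. The function names and the example numbers are hypothetical and are not taken from the experiment; the density is given in m^-2 (1 cm^-2 = 10^4 m^-2).

```python
import math

# Physical constants (SI, exact by definition since 2019).
H = 6.62607015e-34            # Planck constant, J s
HBAR = H / (2.0 * math.pi)    # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19    # elementary charge, C


def transport_mean_free_path(rho_xx_ohm, n_per_m2):
    """l_tr = h / [2 e^2 sqrt(pi |n|) rho_xx], in metres.

    rho_xx_ohm is the 2D (sheet) resistivity in ohms,
    n_per_m2 the carrier density in m^-2 (sign is ignored).
    """
    k_f = math.sqrt(math.pi * abs(n_per_m2))
    return H / (2.0 * E_CHARGE**2 * k_f * rho_xx_ohm)


def cyclotron_radius(n_per_m2, b_tesla):
    """R_c = hbar sqrt(pi |n|) / (e |B|), in metres."""
    k_f = math.sqrt(math.pi * abs(n_per_m2))
    return HBAR * k_f / (E_CHARGE * abs(b_tesla))


# Hypothetical example: |n| = 1e12 cm^-2, rho_xx = 100 ohm, B = 100 mT.
n_example = 1e16  # m^-2
l_tr_example = transport_mean_free_path(100.0, n_example)
r_c_example = cyclotron_radius(n_example, 0.1)
print(f"l_tr = {l_tr_example * 1e6:.3f} um, R_c = {r_c_example * 1e6:.3f} um")
```

Comparing W/R_c against 2 with these helpers indicates whether a given field lies in the regime where wall backscattering is suppressed.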


Diffusivity of etched channel walls in the experiment 
Understanding the nature of electron scattering from the etched walls 
of the graphene channels is essential for both establishing the possibil- 
ity of Poiseuille flow (diffusive walls are necessary for parabolic flow 
profiles), as well as for performing quantitative theoretical modelling 
of the imaging data to compare with experiment. In particular, we wish 
to know to what extent the scattering from the walls randomizes the 
momentum of an incoming electron. We quantify this property of the
walls using a coefficient p that measures the probability of specular 
reflection, which can vary between zero and one. For perfectly specular 
walls, p =1, and electrons will simply reflect off the walls in a mirror- 
like fashion. In the other limit, for perfectly diffusive walls, p= 0, and 
the momentum of the outgoing electron is completely randomized. 

We use three different methods of increasing sophistication in order 
to extract the value of p for our channels, all of which indicate that the 
walls are strongly diffusive and p is nearly zero. To gain a basic intuition
of the degree of diffusivity of the channel walls, we turn to a channel in
which the walls for half of its length have been intentionally roughened
through lithographic patterning (Extended Data Fig. 2a). This sample 
geometry allows us to directly compare how the voltage drops along 
the two different halves of the channel. We plot the voltage drop imaged 
along the centre of the channel in Extended Data Fig. 2b. Tellingly, we 
note that the voltage drops linearly across the region spanning both 
lithographic wall patterns (dashed red line in Extended Data Fig. 2a), 
with no discernible difference between the two halves. We can thus
conclude that the walls of the section of the channel with the
straight-line etch pattern are essentially as rough as the intentionally
roughened section, suggesting that p = 0.

We next use the magnetoresistance data (see Fig. 1c) measured at 
T=7.5 K to estimate p. The double-peaked structure in the magne- 
toresistance is a telltale sign of ballistic transport, but is only present 
if the channel walls are diffusive, that is, p < 1. In short, the mecha- 
nism leading to the double peaks in a ballistic channel is the bending 
of electron trajectories by the field which forces them to scatter off 
the diffusive walls. At zero magnetic field, electrons traverse in straight 
lines, some of which have a shallow angle with respect to the walls, 
and in the absence of bulk scattering can have a long mean free path. 
The magnetic field bends these trajectories and forces electrons to 
hit the walls after a distance proportional to the cyclotron radius. For 
specular walls, this will not affect the resistivity. However, for diffusive 
walls this will cause extra back-scattering, leading to the double-peaked 
structure. This effect is pronounced when the bulk mean free path is 
long. However, for a short mean free path, bulk scattering will dominate 
and the double peak transforms into a single peak. At the transition
from single- to double-peaked, the transport mean free path at zero
field obeys the relation l_tr(B = 0) = W/(1 − p). Extended Data Fig. 2c
plots the transport mean free path as a function of magnetic field and
carrier density, determined from the measured ρ_xx data in Fig. 1c. We
see that the double-peaked structure becomes a single, broad peak as
l_MR decreases with decreasing density, |n|. Applying the above formula,
we find that p ≈ 0 ± 0.1, as l_tr ≈ W = 4.7 µm at the transition from single- to
double-peaked spectrum.
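
The relation at the single- to double-peak transition can be inverted to give p = 1 − W/l_tr(B = 0). The short sketch below only illustrates that inversion; the function name is hypothetical, and the second call uses an assumed 10% offset in l_tr purely to show how sensitive p is to the measured mean free path.

```python
def specularity_from_transition(l_tr_zero_field, width):
    """Invert l_tr(B = 0) = W / (1 - p) to obtain p = 1 - W / l_tr.

    Both lengths must be positive and in the same units.
    """
    if l_tr_zero_field <= 0 or width <= 0:
        raise ValueError("lengths must be positive")
    return 1.0 - width / l_tr_zero_field


W_UM = 4.7  # channel width quoted in the text, in micrometres

# l_tr equal to W at the transition corresponds to fully diffusive walls.
p_exact = specularity_from_transition(W_UM, W_UM)

# Assumed illustration: l_tr 10% larger than W shifts p by roughly 0.09.
p_shifted = specularity_from_transition(1.1 * W_UM, W_UM)
print(p_exact, round(p_shifted, 3))
```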

Finally, we can independently estimate p from the scaling of l_tr(B = 0)
as a function of density using the theoretical description for flow
through a channel as a function of p developed by Molenkamp and de
Jong. We numerically solve for the fan diagram plotted in Extended
Data Fig. 2d using the values of l_MR at T = 7.5 K from experiment, which
shows how l_tr(B = 0) varies with carrier density n. The bold, red trace
corresponds to p = 1, and is therefore identical to l_MR versus n, as expected
for a channel with perfectly specular walls. As p is decreased from unity,
the transport mean free path decreases and the curves level out,
becoming rather flat at p = 0. The bold black trace in Extended Data Fig. 2d
corresponds to our experimentally measured l_tr(B = 0) and closely
matches, though very slightly undershoots, the prediction for p = 0.
Although the fit to the p = 0 theory is good, the slight mismatch suggests
that, while the channel walls in the experiment are nearly fully diffusive,
there may be an edge scattering mechanism at play not captured by the
simple specularity coefficient used in ref.*. Nevertheless, based on the
variation between curves at different p, we can estimate that for our
channel |p| < 0.1, consistent with the above analyses.

We also note that although the above analysis was performed for 
data taken at T=7.5 K, since we expect the diffusivity of the walls only 
to increase as the temperature is increased, the above estimates for p 
are then valid for all temperatures in our experiments. Further, while 
one might expect p to have some variation with carrier density owing 
to the varying strength of p-n junctions near the channel walls, our data
strongly suggest that p remains close to zero for the entire range of hole
carrier densities explored in the experiment, because any deviation
from zero would increase the rate of change of l_tr(B = 0) with n, which
is inconsistent with Extended Data Fig. 2d. Thus, we conclude that the
etched walls of the graphene channels are effectively fully diffusive 
throughout the entire phase diagram of our experiment. 


Dependence of Hall field profile curvature on magnetic field 

Our method for mapping the Hall field, E_y, relies on the application
of a small perpendicular magnetic field, B, to produce a Hall signal
that is measurable by the scanning SET. We must then verify that this
measurement is in the linear response regime with respect to B, namely
that B is low enough not to alter the E_y profile. Specifically, we aim to
prove that the curvature of the E_y profiles, κ, which is a main observable
in this work, is not altered by B. In Extended Data Fig. 3a, we present
the curvature κ imaged at a constant carrier density as a function of
magnetic field at three temperatures, T = 4 K, T = 75 K and T = 150 K.
The curvature is extracted as described in the main text by a parabolic
fit to E_y over the centre of the channel.

We note two distinct regimes of how κ depends on B: for W/R_c > 2,
κ has a strong field dependence, whereas for W/R_c < 2, κ is constant
at each temperature. In the higher-field regime for W/R_c > 2, closed
cyclotron orbits can fit within the width of the channel. This leads to a
rich evolution of E_y profiles that are no longer simply parabolic, and is
the topic of a future work. In the lower-field regime for W/R_c < 2, we see
that the measured curvature is constant to within our measurement
noise down to the lowest fields measured (W/R_c ~ 1). Imaging closer
to B = 0 is increasingly challenging, as the signal-to-noise ratio of the
measured Hall voltage decreases linearly with decreasing field.
Extended Data Fig. 3b shows similar traces (κ versus W/R_c) calculated
using Boltzmann equations for the values of l_MR/W and l_ee/W
corresponding to the experiment. We find good correspondence between
the Boltzmann simulations and the experiment. Most importantly,
in the low-field regime for W/R_c < 2, the simulations confirm that
κ is independent of B as observed in the experiments, and extend this
observation down to B = 0. Based on these results, the value of W/R_c = 1.3
used for the E_y profile imaging in the experiments in the main text is
justified.

Having justified experimentally and with Boltzmann simulations
that the profiles are unperturbed in the low-field regime W/R_c < 2,
we also argue on analytic grounds why the flow profile is not expected
to vary at low magnetic fields. In the hydrodynamic regime, the curvature
is set by the ratio of W to D_ν, where D_ν = (1/2)(l_MR l_ee)^(1/2) is the
Gurzhi length. For low magnetic fields the correction to D_ν is controlled
by the ratio l_eff/R_c, where 1/l_eff = 1/l_MR + 1/l_ee. This correction
goes as B², and will be relevant only
when R_c is of the order of l_eff, which we are far from at W/R_c = 1.3 and
the values of l_ee and l_MR that we achieve in the experiment in the
hydrodynamic regime. We can therefore conclude that the curvature
κ is not dependent on magnetic field for the parameters of our
experiment.


Dependence of Hall field profile curvature on voltage excitation 
In order to drive current through the graphene channel devices, we
apply an oscillating bias voltage of amplitude V_ex between the electrical
contacts to the device. This excitation can in principle induce heating
of the electrons above the temperature of the cryostat, and as a
result cause an increase in curvature of the Hall field profiles. While this
effect can be used instead of substrate heating, we avoid this approach
here owing to the additional spurious effects it may have on the
curvature. We therefore choose an excitation amplitude at each temperature
that is sufficiently low to minimally influence the curvature of the
imaged profiles, but still high enough to enable a robust measurement.

Extended Data Fig. 4 shows the curvature of the field profiles versus
excitation amplitude V_ex applied to the graphene device for two
temperatures, T = 7.5 K in the ballistic regime (blue trace) and T = 75 K in
the hydrodynamic, Poiseuille regime (purple trace). The curvature is
extracted by a parabolic fit to the E_y Hall profile imaged across
the channel at a fixed density and magnetic field as described in the
main text. In the Poiseuille regime (T = 75 K, density n = −3.3 × 10^11 cm^-2,
W/R_c = 1.3), we see that the curvature (κ = 0.5) is essentially independent
of the excitation at least up to V_ex = 11 mV, and therefore the excitation
does not influence the physics of the electron flow. In the ballistic
regime (T = 7.5 K, n = −1.5 × 10^11 cm^-2, W/R_c = 1.3), we see a clear increase
in the curvature with increasing excitation due to electron heating.
Still, for an excitation of V_ex = 4.3 mV, κ is nearly zero and far below the
Boltzmann limit marking the transition to hydrodynamic flow. We can
thus safely choose such a low excitation and robustly image ballistic
electron flow through the channel, although the specific value of κ may
still be somewhat influenced by the excitation. In the experimental
data presented in the main text, for T = 7.5 K, the excitation across the
graphene device is chosen such that V_ex < 4.3 mV, for T = 75 K, V_ex < 7.5 mV,
and for T = 150 K, V_ex < 15 mV.


Comparison of Hall field profile curvature for different devices 
We establish the consistency of our results across a set of graphene 
channel devices and scanning SET probes. The measurements in this 
work were carried out on two separate graphene device microchips, 
each imaged with a different scanning microscope and different SET. 
This allows us to compare between measurements and establish their 
lack of sensitivity to details specific to a particular graphene device or 
experimental setup. We denote the device used throughout the main 
text as device A. The additional device measured, which we denote as 
device B, is a channel with W = 5 µm and L = 42 µm, allowing us additionally
to rule out aspect-ratio-dependent effects (aspect ratio about
3 for device A versus about 8 for device B).

To most easily compare between devices, we examine the curvature
of the Hall field profiles imaged at similar SET-graphene device
separations. We focus on the magnetic field dependence of the curvature at
several different temperatures and densities. The results are shown
in Extended Data Fig. 5. We compare first between measurements
taken at T = 7.5 K and n = −1.5 × 10^11 cm^-2 in device A and T = 4 K and
n = −6 × 10^11 cm^-2 in device B. We then repeat the same comparison,
now at T = 75 K for both devices and n = −3.3 × 10^11 cm^-2 for device A and
n = −1 × 10^11 cm^-2 for device B. The point spread function of the SET has
a similar influence on both devices, and the same valid channel region
is chosen for the extraction of the curvature (|y/W| < 0.3).

In the low-temperature measurement, we observe a similar overall 
shape in the W/R_c < 2 region. The low-field curvature in device A levels
off at a slightly higher value than that in device B. The latter can be 
attributed to the different densities, since, as observed in Fig. 4a, at 
T=7.5 K the curvature exhibits strong density dependence. The cur- 
vatures imaged at elevated temperature closely match each other over 
the full range of magnetic fields, with small residual differences that are 
consistent with the density dependence in Fig. 4a. This indicates that 
the hydrodynamic features observed in this work are not specific to the
particular graphene sample or channel dimensions being measured. 


Distinguishing electron flow regime from transport 

The temperature and width dependence of the resistivity of a channel
relates to one of the earliest predictions in the field of electron
hydrodynamics, made by Gurzhi. Specifically, he recognized the influence
of wall scattering on the total current across the transition from the
ballistic regime (l_ee, l_MR ≫ W) to the Poiseuille regime (l_ee ≪ W ≪ l_MR) by
increasing T and thus decreasing the viscosity, while keeping
momentum-relaxing collisions negligible. As the transition is made, a
decrease in resistivity is expected, and in the Poiseuille regime,
one is then expected to observe W scaling of the conductance. It is
important to note, however, that for this to occur, we must maintain
throughout the crossover the more stringent condition that l_MR is always
much greater than W²/l_ee and l_ee. In an experiment on monolayer
graphene, and with channel widths that are amenable to measurement
with the current generation of scanning SETs, it is impossible to
reach deeply enough into the Poiseuille regime to meet this requirement,
since the increase of temperature that leads to the decrease
in l_ee also leads to increased momentum-relaxing collisions with phonons
(that is, at higher temperature the condition l_MR ≫ W²/l_ee breaks
down).
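
The stringent condition above can be checked numerically. The sketch below assumes the Gurzhi length D_ν = (1/2)(l_MR l_ee)^(1/2) and expresses the requirement l_MR ≫ W²/l_ee as a single dimensionless margin, which equals (2D_ν/W)². The function names and the example lengths are hypothetical and chosen only for illustration.

```python
import math


def gurzhi_length(l_mr, l_ee):
    """D_nu = 0.5 * sqrt(l_MR * l_ee), in the units of the inputs."""
    return 0.5 * math.sqrt(l_mr * l_ee)


def poiseuille_margin(width, l_mr, l_ee):
    """Return l_MR / (W^2 / l_ee); Poiseuille flow requires this to be >> 1.

    The same number equals (2 * D_nu / W) ** 2.
    """
    return l_mr * l_ee / width**2


# Hypothetical lengths in micrometres, not measured values.
W_UM, L_MR_UM, L_EE_UM = 4.7, 10.0, 1.0
d_nu = gurzhi_length(L_MR_UM, L_EE_UM)
margin = poiseuille_margin(W_UM, L_MR_UM, L_EE_UM)
print(f"D_nu = {d_nu:.2f} um, l_MR l_ee / W^2 = {margin:.2f}")
```

A margin of order one or smaller, as in this assumed example, corresponds to the situation described in the text in which the deep Poiseuille regime cannot be reached.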

In this situation, the question of which observables are available for
measurement becomes very important. It turns out that here the
curvature in E_y plays a crucial role: when one performs a Boltzmann
simulation, it can be seen that the dependence of the resistivity for the
values of l_ee/W and l_MR/W corresponding to our experiment is fairly
weak, and much less informative than the dependence of the curvature
of E_y for the same values of these parameters. This is shown explicitly
in Extended Data Fig. 6 (taken from ref.’), which plots the dependence
of the effective scattering length L_eff ~ 1/ρ_xx on l_ee/W, and by extension,
on T, for different values of l_MR/W. The coloured ellipses correspond
to the phase space regions reached in our experiments. While the
resistivity variation over the experimentally relevant parameter
range is weak, the curvature in E_y can vary substantially. This stems
from the fact that the curvature of the flow profile is a geometric
quantity, which directly relates to the length scales in the problem.
Indeed, it is possible to maintain I = ∫ j_x dy = constant for a fixed
applied voltage while changing the curvature of j_x from fully flat to fully
parabolic.

It must be emphasized here that the difficulty in extracting the flow
regime from the resistivity is not a simple case of being able to measure
the latter to greater precision, which would naively allow us to extract
meaningful information from even a small change in the resistivity.
This would indeed be the case if all the quantities that make up the
total resistivity, namely l_imp, l_ph and l_ee, were to have a firmly
understood functional dependence on the control parameters n, T and W.
However, this is not the case in graphene, and one can construct many
models that would end up giving the same nearly flat form of ρ_xx versus
T. Therefore, even a careful fitting of data to theory would not yield
definitive information.

In this context, we mention that by using a judiciously selected sample
geometry, such as in ref. °, a negative minimum in the vicinity
resistance R_V of a bilayer graphene sample has been observed. This minimum,
however, is related to a crossover in the quantity l_ee/x, where x is the
distance from an injection contact to an adjacent probe contact, and
the minimum can be attributed to a geometric effect which is absent
in a channel geometry.


Boltzmann simulations of flow profiles 

To model electron flow through the graphene channels, we employ an
approach based on the Boltzmann equation that incorporates the
effects of both electron-impurity and electron-phonon scattering as
well as electron-electron interactions:

The scattering integral has two contributions: one from momentum-relaxing
scattering, with a rate 1/τ_MR, and one from momentum-conserving,
electron-electron scattering, with a rate 1/τ_ee. This equation describes
the evolution of the semiclassical occupation number f(r, v) for a wave
packet of dynamical mass m at position r and velocity v, where n(r) = ⟨f⟩
is the local charge density, j(r) = ⟨f v⟩ is the local current density,
⟨...⟩ is the momentum average, and 1/τ = 1/τ_MR + 1/τ_ee. For the sake of
simplicity, we consider the case of a circular Fermi surface with
v = v_F ρ̂(θ), where ρ̂ is the radial unit
vector at angle θ and v_F is the Fermi velocity. Mean free paths are then
simply defined as l_MR(ee) = v_F τ_MR(ee). The term proportional to 1/τ_ee is the
simplest momentum-conserving scattering term that can be written,
assuming that the electrons relax to a Fermi-Dirac distribution shifted
by the drift velocity. This form allows for different rates of
momentum-relaxing and momentum-conserving scattering while still being
amenable to computation. Although first used in the context of
two-dimensional electron gases with parabolic bands, it has also been
applied to graphene.

The justification for the scattering integral in equation (3) is twofold.
First, we note that the experiments are always firmly in the Fermi liquid
regime (E_F > k_B T), where the phase space for electron-hole scattering
is negligible. This is illustrated in Extended Data Fig. 7, which plots
the density and temperature of the experimental E_y profiles (red, blue
and purple lines) presented in Fig. 4, along with the boundary E_F = k_B T
(black curve). Above this boundary lies the ambipolar electron-hole/
Dirac fluid regime in which both electrons and holes are present and
scattering between them must be considered. Our experiments lie far
below this boundary in the degenerate Fermi liquid regime, where only
carriers of a single type are present and thus scattering is unipolar.
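
The Fermi liquid criterion can be estimated with a short calculation. The sketch below assumes the standard monolayer-graphene dispersion, E_F = ħ v_F (π|n|)^(1/2), and a typical Fermi velocity of 10^6 m/s; both the velocity and the example density and temperature are assumptions for illustration rather than values taken from the experiment.

```python
import math

HBAR = 1.054571817e-34     # reduced Planck constant, J s
K_B = 1.380649e-23         # Boltzmann constant, J / K
E_CHARGE = 1.602176634e-19 # elementary charge, C (for eV conversion)
V_F = 1.0e6                # m / s, assumed typical graphene Fermi velocity


def fermi_energy(n_per_m2, v_f=V_F):
    """E_F = hbar * v_F * sqrt(pi |n|) in joules (density in m^-2)."""
    return HBAR * v_f * math.sqrt(math.pi * abs(n_per_m2))


def degeneracy_ratio(n_per_m2, temperature_k, v_f=V_F):
    """E_F / (k_B T); values well above 1 indicate a degenerate Fermi liquid."""
    return fermi_energy(n_per_m2, v_f) / (K_B * temperature_k)


# Hypothetical operating point: |n| = 1e12 cm^-2 (1e16 m^-2) at T = 150 K.
e_f_mev = fermi_energy(1e16) / E_CHARGE * 1e3
ratio = degeneracy_ratio(1e16, 150.0)
print(f"E_F = {e_f_mev:.1f} meV, E_F / k_B T = {ratio:.1f}")
```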

Second, beyond the fact that our experiments are in the Fermi liquid
regime, the primary difference between equation (3) and a more
graphene-specific scattering integral is that in equation (3) we have
neglected the enhancement of collinear scattering due to the linear
spectrum, which has a logarithmic dependence on the fine structure
constant. However, for graphene encapsulated in hBN, the fine
structure constant is of the order of one, and thus the enhanced
collinear scattering may be neglected. Moreover, by definition, collinear
scattering mainly relaxes energy and only weakly relaxes the
momentum direction. Since the latter plays the dominant role in how
electrons flow through a channel, it is therefore safe to neglect this
correction.

We assume a sample that is of infinite length along the x axis (which 
is the direction of current flow), and of finite width W along the y axis.
The magnetic field is applied along the z direction. Diffuse scattering 
at the boundaries is imposed by the following boundary condition: 


Ww 


W (4) 
where f_boundary is a constant that is independent of θ and that must be
determined self-consistently. This ensures a uniform probability density
for the angle at which an outgoing electron leaves a given wall,
as required for completely diffuse scattering. Note that f_boundary = 0 at
zero magnetic field but is non-zero in general. More generally, one
could consider a finite degree of specularity for boundary scattering,
by taking


W W 
fly=+F10<20]=plOy{y=+"5,-8) ‘foun 


Af --5,08<n)=pioyf{y=- 56] dpoanaaey (5) 


where p(θ) ∈ [0, 1] is the degree of specularity for electrons at incidence
angle θ. Although our calculations were limited to p = 0 to match the
experiment (see ‘Diffusivity of etched channel walls in the experiment’
in Methods), we expect that adding a small amount of specularity would
only gradually wash out both ballistic and hydrodynamic effects.

Equation (2) is supplemented by Gauss’s law with a charge density 
given by en(x). The resulting integrodifferential equation is solved 
numerically using the method of characteristics to invert the differential
part of the equation, and an iterative method to solve the
integral part. 

We emphasize that the above kinetic approach does not imply a no-slip
boundary condition for the current. Instead, equation (5) merely
imposes randomization of the incoming momentum under boundary
scattering. This condition is well suited for doped graphene (‘Diffusivity
of etched channel walls in the experiment’ in Methods and ref. 7)
and smoothly interpolates between effectively no-slip conditions in
the hydrodynamic regime and a sizable slip length for ballistic flow.
For this reason, the precise value of the specularity coefficient in the
calculation does not qualitatively change the solution.

We also note that, importantly, the determination of the flow 
profile by means of the Boltzmann distribution function ensures 
that full information of the kinematics is retained. This includes 
exceptional trajectories in the ballistic limit. We resolve not only the
long-lived trajectories which travel almost tangentially to the boundary,
but also the boundary skipping orbits which impact the walls
many times.


Relation between E_y and j_x in the hydrodynamic regime

In the hydrodynamic regime for a channel of bulk resistivity ρ^bulk with
diffusive walls, the Hall field E_y(y) across the channel at weak magnetic
field calculated using the Boltzmann kinetic equation approach
is given by:


(6)


where ρ_xy = B/ne is the Hall resistivity and E_x is the electric field along
the channel. Additionally, we calculate the corresponding current
density as:


j_x = \frac{E_x}{\rho^{bulk}} \left[ 1 - \frac{\cosh(y/D_\nu)}{\cosh(W/2D_\nu)} \right]    (7)


We then note the following identity: 
This allows us to substitute equation (8) into equation (6), and using
the relation p,, =“, ~~ we find:


Comparison of theoretical E_y and j_x curvature

For a long, ballistic channel, there are only two relevant parameters
that determine the flow profile: the bulk mean free path normalized by
the channel width, l_MR/W, and the specularity of the walls, p. The case for
specular walls is trivial, and the flow in the channel will be completely
homogeneous. The more interesting, experimentally relevant case is
for diffusive walls with p = 0, where electrons flowing in the bulk are
scattered with mean free path l_MR, and electrons near the edge of the
channel will encounter increased scattering by the diffusive walls. This
physics alone, even without any electron-electron scattering, will
produce a current density that varies from the bulk of the channel to
the edges. For p = 0, the profile of j_x can then only be a function of the
single parameter l_MR/W, where for l_MR/W ≪ 1 the flow is Ohmic and for
l_MR/W ≫ 1 the flow is ballistic.

For a very wide channel, the current profile should be flat, as increasing
W while keeping l_MR fixed leads to l_MR/W ≪ 1, creating Ohmic flow. In
this regime, information about the diffusive walls does not propagate
substantially into the bulk, as the scattering length is much shorter
than the channel width.

In the extreme ballistic regime for l_MR/W → ∞, j_x will also be flat, as an
electron scattered at one wall will reach the other wall without scattering,
effectively transmitting information about the diffusive walls
uniformly throughout the channel. However, for non-infinite l_MR/W,
this is no longer true, and j_x will necessarily have a curved profile. This
is the experimentally relevant regime, as most published experiments
on ballistic channels are done with l_MR/W not much greater than 1 (we
reach l_MR/W = 5). This is illustrated in Extended Data Fig. 8a, where we
plot the curvature of j_x as a function of l_MR/W. The blue curve is based
on the analytical formulas by de Jong and Molenkamp, while the red
curve is produced by a Monte Carlo electron billiards simulation. The
curvature is extracted as in the main text: we fit a parabola of the form
j_x(y) = ay² + c to the central 60% of the channel, with curvature
κ = −(a/c)(W/2)². We see that the curvature of j_x can be substantial even for very
large l_MR/W, exemplifying the difficulty in distinguishing hydrodynamic
electron flow from ballistic flow based on the current profile j_x.

We further compare the phase diagram defined by the theoretical
estimate for the curvature κ of the E_y profiles presented in Fig. 4b with
the phase diagram defined by the theoretical curvature of the j_x current
density profiles. This allows us to present a more complete relation
between E_y and j_x for W/R_c = 1.3 for each flow regime as a function
of l_MR/W and l_ee/W. The phase diagrams are presented side by side in
Extended Data Fig. 8b, c. The j_x curvature phase diagram is constructed
similarly to the E_y phase diagram, fitting a parabola to the centre of the
j_x profiles calculated from the Boltzmann model after convolution
with the point spread function of the SET. Examining first the right,
non-hydrodynamic half of the phase diagram, we again note the large
difference between the curvature in the ballistic regime of E_y and j_x.
Whereas E_y can be negatively curved, j_x is always positively curved, with
high curvature throughout the ballistic regime. The crossover between
the ballistic regime and the Ohmic regime is evident in both phase
diagrams, although the j_x curvature simply decreases from ballistic to
Ohmic, while E_y goes through a local maximum near the crossover. In
the hydrodynamic regime, both phase diagrams are similar, with the
curvature matching exactly in both limits of strongly Poiseuille and
strongly porous hydrodynamic electron flow. This highlights the
restoration of a local relation between E_y and j_x, which leads to a convergence
between these quantities in the hydrodynamic regime.


Comparisons of imaged $E_y$ profiles to simulations

To determine the best match to theory, we fit the entirety of each
imaged $E_y(y)$ profile over the range of $|y/W| < 0.3$ to the profiles obtained
from the Boltzmann calculations. As $l_{\mathrm{MR}}$ is already determined
independently from the magnetoresistance measurements, this procedure
gives us the corresponding values of $l_{ee}$. The fit of each profile therefore
has to match not only the curvature, but also the overall height, which
is related to the total conductivity. In Extended Data Fig. 9 we present
three representative imaged $E_y$ profiles along with the best-fit
Boltzmann profiles (green curves). While the theory matches the imaged
profiles at T = 7.5 K (blue curve) and T = 75 K (purple curve) in both
curvature and height, the profile at T = 150 K (red curve) is clearly more
curved than the best-match Boltzmann profile. As stated in the main
text, this mismatch may be due to the relaxation time approximation
for electron-electron interactions used in the Boltzmann calculations.
Further theoretical developments are necessary to more completely
explain the hydrodynamic flow profiles at higher temperatures. Still,
we emphasize that this mismatch between theory and experiment does
not affect the main result of the paper, which is the observation
of Poiseuille electron flow and its distinction from ballistic flow.
Extended Data Fig. 1 | Relation between transport and momentum-relaxing
mean free path across the phase diagram of flow regimes and the electron
mobility. a, Boltzmann calculation of $l_{xx}$ versus $l_{\mathrm{MR}}$ and $l_{ee}$. The two-dimensional
map shows the ratio of the finite-field-transport mean free path,
$l_{xx}(B) = h/[2e^2 (\pi|n|)^{1/2} \rho_{xx}(B)]$, and the bulk mean free path, $l_{xx}/l_{\mathrm{MR}}$, calculated using
Boltzmann theory at $W/R_c = 3.2$ for a channel with diffusive walls, as a function
of $l_{ee}/W$ and $l_{\mathrm{MR}}/W$. b, Electron mobility $\mu$ measured with scanning SET,
equivalent to the $l_{\mathrm{MR}}$ data presented in Fig. 1d.
Extended Data Fig. 2 | Diffusivity of etched channel walls in the experiment.
a, Illustration of a channel used to assess the diffusivity of etched walls by
direct comparison to lithographically roughened walls. The walls of the left
half of the channel are patterned with a typical straight-line pattern, whereas
the right half is patterned with a saw-toothed pattern to introduce roughness.
The region enclosed by the dashed box in the upper right is an AFM image of the
etched walls. The red dashed line marks the spatial region spanning both wall
patterns along which the potential drop is measured. b, Measured potential
drop along the centre of the channel (red dashed line in a). Away from the
voltage steps at the contacts, the potential drops linearly across the device,
with no observable change in slope when the walls transition from straight-line
to saw-tooth etches. c, Zoom of magnetoresistance data from Fig. 1c plotted as
$l_{xx}(B)$ for varying density. At the transition where the double peak disappears,
$l_{xx}(B = 0) = W/(1 - p)$, allowing estimation of specularity p. d, Theoretical scaling
of $l_{xx}(B = 0)$ with n for varying p superimposed with the experimental data (bold
black line), indicating that p is nearly zero (fully diffusive walls).
Extended Data Fig. 3 | Dependence of Hall field profile curvature on
magnetic field. a, Measured traces of $\kappa$, extracted from $E_y$ using a fit to the
centre of the channel, as a function of magnetic field plotted in units of
$W/R_c \propto B$. The blue trace is measured at T = 4 K and hole density of
n = -6.06 x 10" cm⁻² on device B (see Methods and Extended Data Fig. 5). The
orange trace is measured at T = 50 K and n = -1.02 x 10” cm⁻² on device B, and the
yellow trace is measured at T = 150 K and a hole density of n = -3.15 x 10" cm⁻² on
device A, which is the device used throughout the main text. Two distinct
regimes are apparent: below $W/R_c = 2$, the curvature is nearly independent of
$W/R_c$, whereas above it varies noticeably, acquiring large values. b, Curvature as
a function of $W/R_c$ extracted from a Boltzmann simulation of $E_y$ as described in
the main text. Coloured curves correspond to values of $l_{\mathrm{MR}}$ and $l_{ee}$ that best
match experiment. This figure verifies that by imaging at $W/R_c = 1.3$ as in the
main text, the profiles are not influenced by the magnetic field.
Extended Data Fig. 4 | Dependence of Hall field profile curvature on voltage
excitation. $\kappa$, the normalized curvature of $E_y$, is plotted as a function of the
excitation voltage amplitude applied between the contacts of the channel. Error
bars correspond to the standard deviation of $\kappa$ from the least-squares fit of a
parabola to the data. The blue trace shows T = 7.5 K and n ≈ -1.5 x 10" cm⁻²; the
purple trace shows T = 75 K and n = -3.3 x 10" cm⁻². This plot verifies that by
choosing appropriate values for the excitation, as was done for the
experiments in the main text, electron heating effects are negligible.
Extended Data Fig. 5 | Comparison of Hall field profile curvature for
different devices. a, Top, optical image of graphene device (device A)
patterned into the geometry of a channel, with W = 4.7 µm and L = 15 µm,
studied in the main text. Bottom, normalized curvature of $E_y$, $\kappa$, measured as a
function of $W/R_c$. b, Top, optical image of an additional graphene device
(device B) used for similar measurements, with W = 5 µm and L = 42 µm.
This device was measured in a separate cryostat with a different scanning
microscope and different SET. Colour differences between optical images are
due to lighting conditions. Bottom, $\kappa$ versus $W/R_c$ measured for device B,
showing a result highly consistent with that in a.
Extended Data Fig. 6 | Distinguishing electron flow regime from transport.
The graph shows the dependence of $L_{\mathrm{eff}}/W$, which is inversely proportional to
the resistivity, on $l_{ee}/W$, for fixed values of $l_b/W = l_{\mathrm{MR}}/W$. The purple- and
red-coloured regions correspond to the parameter ranges of our experiment for
T = 75 K and T = 150 K, respectively. It is evident from these curves that the
dependence of the resistivity on $l_{ee}/W$ is fairly weak when $l_{\mathrm{MR}}/W$ is not much
larger than 1. Figure reprinted with permission from de Jong and Molenkamp;
copyright 2019 by the American Physical Society.
Extended Data Fig. 7 | Phase diagram demonstrating that the experiment
falls within the Fermi liquid regime. The black line shows the equality $E_F = k_B T$,
which separates the temperature-density plane into two distinct regimes.
Above this line is the Dirac fluid regime in which electrons and holes are both
present and thus electron-hole scattering must be considered. Below this line
is the degenerate Fermi liquid regime in which only one charge carrier is
present. The blue, purple and red lines correspond to the experiments
presented in Fig. 4 at T = 7.5 K, T = 75 K and T = 150 K, respectively, and show that
our experiments are categorically within the Fermi liquid regime.
Extended Data Fig. 8 | Curvature of ballistic $j_x$ and comparison of theoretical
$E_y$ and $j_x$ across phase diagram. a, Curvature of ballistic current profile versus
$l_{\mathrm{MR}}/W$. The analytic solution (blue curve) is based on de Jong and Molenkamp
and the red curve is a Monte Carlo billiard ball simulation result. The two
methods agree perfectly until $l_{\mathrm{MR}}$ exceeds the channel length used in the
billiard ball simulation, beyond which the solutions begin to deviate.
b, Curvature $\kappa$ of $E_y$, as in Fig. 4b, calculated by Boltzmann simulation
(see Methods), as a function of $l_{ee}/W$ and $l_{\mathrm{MR}}/W$ for $W/R_c = 1.3$. Curvature is
calculated over the centre of the channel. Green lines divide the panel into flow
regimes as in Fig. 4b. c, Curvature $\kappa$ of $j_x$, extracted from the same simulation
as a. For $j_x$, the curvature in the ballistic regime is essentially constant at $\kappa = 0.31$
and so the curvature of $j_x$ is less discriminating between the hydrodynamic and
ballistic regimes than the curvature of $E_y$, which becomes negative. In the other
regimes, the curvatures of $j_x$ and $E_y$ are very similar, and the differences
between them diminish as each of the length scales becomes much smaller
than W. In the hydrodynamic regime the curvature saturates on the maximal
possible value for a strictly parabolic profile, and in the porous regime it
follows the length scale $D_\nu = \frac{1}{2}\sqrt{l_{\mathrm{MR}} l_{ee}}$, as expected.
Extended Data Fig. 9 | Comparisons of representative imaged $E_y$ profiles to
the Boltzmann simulated profiles. Boltzmann simulation profiles are plotted
in green, whereas the experimental data is plotted with the same colour
scheme as in main text based on temperature (blue for T = 7.5 K, purple for
T = 75 K and red for T = 150 K). The field $E_y$ is normalized as in the main text by the
classical Hall field $E_{\mathrm{cl}} = (B/ne)(I/W)$.
Two-dimensional semiconductors have emerged as a new class of materials for
nanophotonics owing to their strong exciton-photon interaction and their ability to
be engineered and integrated into devices. Here we take advantage of these
properties to engineer an efficient lasing medium based on direct-bandgap interlayer
excitons in rotationally aligned atomically thin heterostructures. Lasing is measured
from a transition-metal dichalcogenide heterobilayer (WSe₂-MoSe₂) integrated in a
silicon nitride grating resonator. An abrupt increase in the spatial coherence of the
emission is observed across the lasing threshold. The work establishes interlayer
excitons in two-dimensional heterostructures as a gain medium with spatially
coherent lasing emission and potential for heterogeneous integration. With
electrically tunable exciton-photon interaction strengths and long-range dipolar
interactions, these interlayer excitons are promising for application as low-power,
ultrafast lasers and modulators and for the study of many-body quantum
phenomena.


Semiconductor lasers are ubiquitous in today’s technology because 
they are compact, cover a wide range of wavelengths and allow efficient 
electrical pumping and fast electrical modulation. They are predomi- 
nantly based on traditional III-V quantum wells. To achieve lower power 
consumption, more compact size and a higher degree of integration 
with silicon, there has been tremendous effort to develop alternative 
gain materials and structures, such as nanowire lasers, spasers and
photonic crystal lasers. However, tunability, electrical pumping and
heterogeneous integration remain as common challenges. 

Recently, monolayer transition-metal dichalcogenide crystals
(TMDCs) have emerged as a new class of material for semiconductor
lasers, as they are atomically thin and feature strong exciton emission.
Whereas lattice mismatch limits the choice of substrates for
three-dimensional (3D) semiconductors, two-dimensional (2D) TMDCs do
not have dangling bonds, and can be directly integrated with different
substrates. Previous studies have used two criteria to assess lasing in
monolayer TMDCs: nonlinear intensity dependence, and linewidth
reduction as a function of pump power. However, the photon flux
appears to be below the stimulated emission threshold. Spatial
coherence, an important property for characterizing lasers, has not been
studied. Hence it is difficult to exclude localized excitons, such as those
bound to point defects, as the source of the observed nonlinear power dependence.
Moreover, with only a monolayer as the gain medium, tunability is
limited and vertical p-n junctions are not possible without contacting
with other doped semiconductors.

In contrast, heterostructures open the door to the engineering
of band structures and exciton states. Spatially indirect excitons in
heterostructures have been intensively studied, for they feature
an electrically tunable static dipole with long-range dipole
interactions, promising rich many-body quantum phenomena. However,
the reduced oscillator strength of spatially indirect excitons typically
renders them dark and hard to access.

Here we show that in rotationally aligned 2D WSe₂-MoSe₂ heterobilayers
integrated on a silicon nitride (SiN) cavity (Fig. 1), interlayer excitons
form an efficient gain medium, supporting lasing with extended
spatial coherence at a low population inversion density. As illustrated
in Fig. 1b, by forming a direct bandgap between the two monolayers
that are less than one nanometre apart, the interlayer excitons retain a
sufficiently large oscillator strength. With type-II band alignment, the
heterobilayer forms a three-level system that allows efficient pumping
through the intralayer exciton resonances followed by rapid electron
transfer to a lower-energy empty conduction band (Fig. 1c). As a
result, population inversion is readily achieved at the reduced bandgap
while avoiding fast intralayer radiative loss of the carriers. Moreover,
unlike some of the cavities used for monolayer exciton lasers, the cavity
mode in our device fully covers the heterobilayer, allowing gain over
the full area of the bilayer, and supporting extended spatial coherence
(Fig. 1a). We observe lasing accompanied by an abrupt increase in the
spatial coherence length as the photon occupancy exceeds unity. The
emission intensity increases nonlinearly more than 100-fold across the
threshold, and then continues to increase linearly with pump power
(without saturation) up to the highest power used. Our results establish
interlayer excitons in engineered TMDC heterobilayers as an efficient
lasing medium, which, compared to excitons in monolayer TMDCs,
feature electrically tunable long-range dipole interaction and oscillator
strength, robust valley polarization, and a type-II band alignment


Fig. 1 | Illustration of the heterobilayer/grating-cavity laser system.
a, Schematic of the laser device, consisting of a heterobilayer on a grating
cavity. The along-bar (cross-bar) direction and polarization are defined as x (y)
and TE (TM), respectively. Grating cavity design parameters are the following:
total SiN thickness (t), SiO₂ thickness (d), grating thickness (h), grating period
(Λ) and gap width (g). We define $\theta_x$ ($\theta_y$) as the azimuthal angle of the light beam
in the x-z (y-z) plane with respect to the z axis, as indicated by the red arrows.
b, Illustration of the rotationally aligned heterobilayer with twist angle θ = 0°
(top), and correspondingly a direct bandgap at the K valleys (bottom).


well-suited for electrical injection via an atomically thin bilayer p-n
junction.

The lasing device comprises a rotationally aligned WSe₂-MoSe₂
heterobilayer placed on a SiN grating resonator, as illustrated in Fig. 1a. To
form bright interlayer excitons, we accurately align the crystal axes of
the WSe₂ and MoSe₂ monolayers to within 1° of relative rotation, as
verified by second-harmonic generation (SHG) measurements (Extended
Data Fig. 1). Consequently, the band extrema at the K valleys of the
two monolayers align in momentum space to form a direct bandgap
(Fig. 1b). With type-II band alignment, carriers can be injected into the
heterobilayers efficiently via the intralayer exciton resonance, followed
by rapid electron transfer to the empty conduction band of MoSe₂ on
a timescale of 10-100 fs (Fig. 1c). As a result, band inversion can be
established at the smaller, interlayer bandgap. Once the electrons and
holes are separated into the two monolayers, radiative recombination is
reduced, giving long interlayer exciton lifetimes of the order of 1 ns
(Extended Data Fig. 2). Photoluminescence (PL) measurements of the heterobilayer
show that interlayer exciton emission is much stronger than intralayer
emission (Fig. 2b), confirming efficient charge transfer and sufficient
build-up of the interlayer-exciton population. Spatially resolved PL
shows uniform emission from the interlayer (intralayer) excitons in
the bilayer (monolayer) regions (Extended Data Fig. 3).

The grating cavity provides optical feedback when photons are cou- 
pled to its resonances. The cavity modes are sensitive to the propaga- 
tion and the polarization directions of the electric field. We define 
the propagation (polarization) direction along the grating bar as x 
(TE) and across the bar as y (TM), as illustrated in Fig. 1a. We tune the 
grating period Λ, thickness h and fill factor g to obtain a high quality
factor (Q-factor) for the TE mode and match it to the exciton resonance
at zero in-plane wavenumber, k = 0. The heterobilayer lies directly on
the grating where the evanescent field remains strong. The TM cavity
modes are far blue-detuned from the excitons; therefore, the TM
exciton modes are not affected by the cavity (Extended Data Fig. 4).

We confirm the TE-cavity modes by measuring the empty-cavity dis- 
persion with angle-resolved reflectance spectroscopy; the results agree 
well with the simulation by rigorous coupled wave analysis (RCWA), as 
shown in Fig. 2c. The TE mode Q-factor from the simulation is around 
2,000. However, the actual cavity Q-factor is presumably lower, owing 
to fabrication imperfections. From the reflectance spectra of the empty 
cavity, we estimate a Q-factor of between 500 and 680 (Fig. 2c inset), 
but the exact value is difficult to determine owing to low contrast and 
white-light noise. The PL spectral linewidth from the device
corresponds to a Q-factor of around 630 (Fig. 2d).

The heterobilayer allows efficient optical pumping through the intra- 
layer exciton resonances, which are far above the resonances of the 


$K_M$ and $K_W$ denote the K valleys of the MoSe₂ and WSe₂ layers, respectively. c,
Band alignment and carrier dynamics of the heterobilayer. The heterobilayer
has a type-II band alignment, forming a three-level system for the injected
carriers. Intralayer excitons are excited by a pump laser in the WSe₂ layer (solid
wavy line). Some electrons transfer to the lower MoSe₂ conduction band on a
fast (10-100 fs) timescale (dotted line), while others recombine as intralayer
excitons with lifetimes of 1-10 ps (dash-dotted wavy line). Without the cavity,
the interlayer excitons (dashed line) recombine with a lifetime of the order of
1 ns and, with cavity enhancement, of the order of 100 ps.


interlayer excitons or the cavity. With the pump laser at 1.7 eV, the PL
from the cavity mode at k = 0 and energy E = 1.35 eV brightens rapidly
as the pump power increases, as seen in the along-bar angle-resolved
PL (Fig. 3a).

Integrating over $k_x = \pm 0.7$ µm⁻¹ and E = 1.352 to 1.359 eV, we obtain
the photon occupancy $I_p(k = 0)$ after accounting for the independently
measured collection efficiency of the optical path (see Methods). As
$I_p(k = 0)$ approaches one, it shows a clear superlinear increase
Fig. 2 | Properties of the heterobilayer and grating cavity. a, An optical
microscope image of the WSe₂/MoSe₂ heterobilayer integrated on a grating
cavity. The red square outlines the grating region, and the red circle indicates
the laser spot size. Inset, direction of the grating bars. b, PL spectrum from the
heterobilayer. The sample was pumped with a 633-nm laser at a power of 20 µW.
The shaded boxes highlight the spectral range of interlayer, MoSe₂ and
WSe₂ exciton emission. c, TE-polarized along-bar, angle-resolved, simulated
(left and overlaid crosses in the right) and measured (right) reflectance
spectrum. Inset, line-cut of the normalized reflectance spectrum around
$k_x = 1.7$ µm⁻¹ (blue trace); the red line is a fit to the cavity mode. The star symbol
marks the peak of the fitted cavity mode. d, PL spectrum (blue dots) near $k_x = 0$.
The pump was on resonance with WSe₂ at a pump power of 0.1 µW. The red line
is a Lorentzian fit, with a fitted linewidth of 2.4 meV.
Fig. 3 | Spectral properties of the interlayer exciton laser. a, Angle-resolved
micro-PL spectra for the along-bar TE direction at P = 0.6 µW with overlaid
simulated empty cavity (crosses) and cavity with bilayer (stars) dispersions.
b, The photon occupancy (red) and linewidth (blue) of the TE emission versus
input pump power. The emission intensity is integrated over $|k_x| < 0.7$ µm⁻¹,
$|k_y| < 0.13$ µm⁻¹ and E = 1.352-1.359 eV. The dot-dashed line indicates linear
dependence, the vertical red line marks $P_{\mathrm{th}}$, and the horizontal purple line
indicates $I_p = 1$. c, Angle-resolved micro-PL spectra for the along-bar TM
direction at P = 10 µW. d, The pump power dependence of the TM emission
photon occupancy (red) and linewidth (blue), integrated over $|k_x| < 2$ µm⁻¹,
E = 1.340-1.400 eV (open symbols) and $|k_x| < 0.7$ µm⁻¹, E = 1.352-1.359 eV (filled
symbols). The integration range in $|k_y|$ is 0.13 µm⁻¹. The error bars on the photon
occupancy data include the shot noise and detector read noise. The error bars
on the linewidth data correspond to the 95% confidence interval of the
Lorentzian fit.


with pump power, consistent with the onset of stimulated emission
into the cavity mode (Fig. 3b). The power-dependent PL measurement
is reproducible, as shown by a measurement performed on a different
day (Extended Data Fig. 5).

The pump power at the threshold of $I_p(k = 0) = 1$ is $P_{\mathrm{th}} = 0.18$ µW.
Considering the typical absorption efficiency (20%) of monolayer WSe₂,
we obtain the threshold carrier density $n_{\mathrm{th}}$ = 5.7 x 10" cm⁻², in good
agreement with the density required for the transparency condition,
8 x 10" cm⁻² (see Methods for the calculations). Far above threshold,
the output intensity becomes linear with pump power and does not
saturate up to $P = 28P_{\mathrm{th}}$, the highest power used for TE measurements.
The nonlinear increase of the intensity is reproduced by a simplified
rate equation model, as described in Methods (Extended Data Fig. 6).

Accompanying the superlinear increase in the emission intensity
at threshold, the linewidth of the emission drops sharply, as shown in
Fig. 3b, signifying the increase of temporal coherence. The linewidth at
excitation powers below ~0.05 µW may be broader, but our detectors
are not sufficiently sensitive to detect the emission. The saturation
and slight increase of linewidth with increasing power above threshold
may be due to interactions among the carriers and spatial mode
competition. The sharp lasing emission decreases in intensity as we
increase temperature, but persists up to 70 K, suggesting that lasing
may survive at 70 K or higher (Extended Data Fig. 7).

In stark contrast with the TE emission, TM-polarized emission is not
coupled to the cavity mode and does not show threshold behaviour.
The emission becomes detectable only at high pump powers. An example
is shown in Fig. 3c for P = 10 µW. The emission spreads uniformly
in k over the numerical aperture of our collection optics and over a
broad energy range of about 70 meV. The total integrated emission
intensity increases sublinearly with pump power (Fig. 3d) and is a few
times weaker than the TE intensity.
When integrated over the same small ranges of k and E near the lasing
mode, the TE and TM output intensities differ by several orders of
magnitude (filled diamonds in Fig. 3b, d). In other words, while the TM
emission is suppressed and remains broadly distributed in k and E, the
TE emission is concentrated in ranges of energy and k that are one to
two orders of magnitude smaller, as a result of stimulated emission.

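To confirm the extended coherence expected of a laser with a 2D gain
medium, we study the first-order spatial coherence function $g^{(1)}(\mathbf{r}_1, \mathbf{r}_2)$
defined as follows:

$$g^{(1)}(\mathbf{r}_1, \mathbf{r}_2) = \frac{G^{(1)}(\mathbf{r}_1, \mathbf{r}_2)}{\sqrt{G^{(1)}(\mathbf{r}_1, \mathbf{r}_1)\, G^{(1)}(\mathbf{r}_2, \mathbf{r}_2)}} \qquad (1)$$

Here $G^{(1)}$ is the first-order correlation function, and corresponds to:

$$G^{(1)}(\mathbf{r}_1, \mathbf{r}_2) = \mathrm{Tr}\{\rho\, E^{(-)}(\mathbf{r}_1)\, E^{(+)}(\mathbf{r}_2)\} \qquad (2)$$

where Tr indicates trace, $\rho$ is the density matrix operator, and $E^{(-)}$ and
$E^{(+)}$ are field creation and annihilation operators, respectively.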

Although spatial coherence properties have been extensively studied
in semiconductor photon lasers, exciton-polariton lasers and
plasmon lasers, coherence of TMDC lasers has not been studied thus
far, making it difficult to rule out localized excitons as a source of
lasing. However, the large spatial area of the grating resonator, and the
large photon flux above threshold, allow us to investigate the spatial
coherence of the interlayer exciton emission. First-order spatial
coherence measurements were performed using a continuous wave
excitation laser and a retro-reflector Michelson interferometer setup, where
an image of the sample interferes with a centro-symmetrically inverted
version of itself at the output with an intensity distribution $I(\mathbf{r})$
(Fig. 4a). Because of a small angle difference between the two beams,
interference fringes are formed (Fig. 4b) that correspond to slightly
varying path length differences $z_0(\mathbf{r})$ at different positions $\mathbf{r}$ across
the images; $z_0$ is the initial position of the retro-reflector.

As the path length of the interferometer is varied, $I(\mathbf{r})$ at each $\mathbf{r}$
oscillates, with the contrast of the oscillation proportional to the first-order
spatial coherence, $g^{(1)}(\mathbf{r}, -\mathbf{r})$ (Fig. 4c). We thus obtain spatial maps of
$g^{(1)}(\mathbf{r}, -\mathbf{r})$ (see Methods for details). Below threshold, the emission is
too weak for $g^{(1)}(\mathbf{r}, -\mathbf{r})$ measurements. Near threshold, the map is rather
noisy without a clear pattern of $g^{(1)}(\mathbf{r}, -\mathbf{r})$ versus $\mathbf{r}$ (Fig. 4d, top panel).
Above threshold, a clear pattern emerges, showing a high $g^{(1)}(\mathbf{r}, -\mathbf{r})$ near
$\mathbf{r} = 0$ that decays with increasing $\mathbf{r}$ and extends above the background
fluctuations to about twice the laser spot size (Fig. 4e, top panel).
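
As an illustration of how a fringe contrast can be read off a single pixel, the sketch below fits a sinusoid to the intensity recorded while the interferometer phase is scanned. It is a sketch under stated assumptions, not the procedure from Methods: the name `fringe_visibility` and the synthetic data are hypothetical, the scanned phase is assumed to be known, and the contrast equals $|g^{(1)}|$ only when the two interfering beams have equal intensity (otherwise the two are merely proportional).

```python
import numpy as np

def fringe_visibility(phase, intensity):
    """Fit I(phi) = I0 + B*cos(phi) + C*sin(phi) by linear least squares
    and return the fringe contrast sqrt(B**2 + C**2) / I0."""
    phase = np.asarray(phase, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    design = np.column_stack([np.ones_like(phase), np.cos(phase), np.sin(phase)])
    offset, b, c = np.linalg.lstsq(design, intensity, rcond=None)[0]
    return np.hypot(b, c) / offset

# Synthetic single-pixel trace scanned over a phase of 4*pi (assumed values).
phase = np.linspace(0.0, 4.0 * np.pi, 200)
intensity = 3.0 * (1.0 + 0.35 * np.cos(phase + 0.7))
visibility = fringe_visibility(phase, intensity)
```

Fitting the cosine and sine amplitudes separately avoids a nonlinear fit for the unknown phase offset of each pixel.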

To study the functional dependence of $g^{(1)}(\mathbf{r}, -\mathbf{r})$, we average over
y = ±0.2 µm and obtain $g^{(1)}(x, -x)$. As shown in the bottom panels of
Fig. 4d, e, the decay of $g^{(1)}(x, -x)$ is clearly slower above threshold than
below threshold. The plot of $g^{(1)}(x, -x)$ versus x is fitted well by a Gaussian
function with the standard deviation $\sigma$ as a fitting parameter. From
the fits, we obtain the coherence length $\lambda_c = \sqrt{2\pi}\,\sigma$. As seen in Fig. 4f,
$\lambda_c$ increases abruptly across the threshold, from 2.38 µm near threshold
to about 5 µm above threshold, confirming the formation of extended
spatial coherence in the laser. Above threshold, the $\lambda_c$ value remains
largely unchanged, possibly limited by the laser spot size and carrier
diffusion length. The $\lambda_c$ value decreases slightly at the highest powers,
possibly because of competition of multiple spatial modes in the
absence of lateral confinement potentials. We note that the measured
$g^{(1)}(x, -x)$ is much lower than the actual value, owing to difficulty in
achieving good alignment.
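
The Gaussian fit and the conversion from standard deviation to coherence length can be sketched as below. This is an assumption-laden illustration rather than the fitting routine used for Fig. 4f: the function name `coherence_length` and the synthetic line-cut are hypothetical, and the fit is done on the logarithm of the data, which assumes a noise-free, strictly positive Gaussian with no background offset.

```python
import numpy as np

def coherence_length(x, g1):
    """Fit g1(x) = A * exp(-x**2 / (2 * sigma**2)) via a straight-line fit
    of log(g1) against x**2, and return (lambda_c, sigma) with
    lambda_c = sqrt(2*pi) * sigma."""
    x = np.asarray(x, dtype=float)
    g1 = np.asarray(g1, dtype=float)
    # log(g1) = log(A) - x**2 / (2 * sigma**2), so the slope is -1/(2*sigma**2).
    slope, _intercept = np.polyfit(x ** 2, np.log(g1), 1)
    sigma = np.sqrt(-1.0 / (2.0 * slope))
    return np.sqrt(2.0 * np.pi) * sigma, sigma

# Synthetic line-cut with sigma = 2 micrometres (assumed values).
x = np.linspace(-6.0, 6.0, 61)
g1 = 0.3 * np.exp(-x ** 2 / (2.0 * 2.0 ** 2))
lambda_c, sigma = coherence_length(x, g1)
```

For real data with noise and a background, a nonlinear least-squares fit with an offset term would be the more robust choice; the log-linear form is used here only to keep the example dependency-free.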

In conclusion, we have demonstrated a 2D WSe₂-MoSe₂ heterobilayer
laser on a grating cavity. The injected carrier density at threshold is
within an order of magnitude of the estimated transparency condition, 
suggesting band inversion between the MoSe, conduction band and the 
WSe, valence band as the gain mechanism. The type-II band alignment, 
resulting in charge separation and longer exciton lifetimes, may have 
facilitated the establishment of a population inversion. In addition, 
Fig. 4 | First-order coherence of the interlayer exciton laser. a, Top, schematic
of the Michelson interferometer setup. Bottom, illustration of
centro-symmetrically interfered images. b, Typical interference pattern above $P_{\mathrm{th}}$
(20 µW). c, Intensity plots of single pixels (labelled as squares 1 and 2 in b) as the
retro-reflector position is scanned over a phase of 4π. d, e, Top, maps of $g^{(1)}(\mathbf{r}, -\mathbf{r})$
near $P_{\mathrm{th}}$ (0.3 µW) (d) and above $P_{\mathrm{th}}$ (10 µW) (e). Bottom, horizontal line-cuts of


for heterobilayers, moiré lattices are expected. By analogy to quantum
dot lasers, localization of the interlayer excitons in a moiré lattice
may lead to increased phase space density in the lasing mode for the
same carrier density, as well as reduced non-radiative loss of the trapped
interlayer excitons, enhancing the performance of heterobilayer lasers.

Future studies may clarify the role of moiré lattices in heterobilayer 
lasers. The present heterobilayer laser could be improved by reducing 
the inhomogeneous broadening of the gain medium via encapsula- 
tion with hexagonal boron nitride, improving the cavity Q, and reduc- 
ing mode competition with lateral confinement of the cavity modes. 
Also, different combinations of van der Waals materials in the hetero- 
bilayer could be used to create interlayer exciton lasers of different 
wavelengths. Using cavities with lateral rotational invariance would
allow a valley-polarized interlayer exciton laser to be realized. Finally,
electrical tuning of the oscillator strength might allow fast modulation
of the laser, and electrical injection could be implemented via atomically
thin, bilayer p-n junctions. Adiabatic electrical tuning might
be used to explore coherent indirect exciton gases.
Methods


Sample fabrication 

To fabricate the grating cavity, we first grew a SiN film with a SiO₂ buffer
on a Si substrate using low-pressure chemical vapour deposition,
then patterned it using electron beam lithography and created the
grating bars by plasma dry etching. The grating parameters indicated
in Fig. 1a are as follows: d = 1,475 nm, t = 113 nm, h = 100 nm, Λ = 615 nm
and g = 50 nm. The individual WSe₂ and MoSe₂ monolayers were
mechanically exfoliated onto a SiO₂ substrate using PDMS
(polydimethylsiloxane) polymer. The exfoliated monolayers were stacked
into a heterostructure using a high-accuracy rotational alignment
method. First, the MoSe₂ was picked up with a PDMS/PPC
(polypropylene carbonate) stamp under an optical microscope. Second, the
crystal axes of MoSe₂ and WSe₂ were rotationally aligned to be 0° or
60° before stacking. Third, the stacked heterostructure was dropped
down onto the grating cavity. Last, the polymer residue was dissolved,
and the sample was annealed at 350 °C for a total of 7 h.


Heterobilayer twist angle 

We can verify the twist angle of the heterobilayer aligned under the
optical microscope by angle-dependent second-harmonic generation
(SHG) measurements. Extended Data Fig. 1 shows an optical microscope
image of two different samples and the corresponding angle-dependent
SHG measurements. By fitting the SHG pattern with a $\cos^2(3\theta)$ function,
where $\theta$ is the angle between the armchair direction of the monolayer
and the polarization direction of the beam, we can obtain the twist
angle. We did not measure the SHG for the heterobilayer before putting
it on the grating, but from experience, we have found that the straight
edges of exfoliated monolayers reliably correspond to the armchair
axis of the crystal. Therefore, we aligned the two straight flake edges
under the optical microscope.
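
One way to extract the armchair orientation from an angle-dependent SHG trace, and from it a twist angle, is sketched below. It is a hedged illustration, not the authors' fitting code: the function names, the 5° sampling and the synthetic angles are hypothetical. It uses the identity $\cos^2(3(\theta - \theta_0)) = \frac{1}{2}[1 + \cos(6\theta - 6\theta_0)]$, so the offset $\theta_0$ follows from a linear fit and is only defined modulo 60°, which is why SHG intensity alone cannot separate 0° from 60° stacking.

```python
import numpy as np

def armchair_angle_deg(theta_deg, shg):
    """Fit shg = A + B*cos(6*theta) + C*sin(6*theta) and return the
    armchair offset theta0 in degrees, wrapped to [0, 60)."""
    theta = np.radians(np.asarray(theta_deg, dtype=float))
    design = np.column_stack(
        [np.ones_like(theta), np.cos(6 * theta), np.sin(6 * theta)]
    )
    coeffs = np.linalg.lstsq(design, np.asarray(shg, dtype=float), rcond=None)[0]
    # The phase of the six-fold component equals 6 * theta0.
    return (np.degrees(np.arctan2(coeffs[2], coeffs[1])) / 6.0) % 60.0

def twist_angle_deg(angle_a, angle_b):
    """Relative rotation of two lattices, wrapped to (-30, 30] degrees."""
    diff = (angle_b - angle_a) % 60.0
    return diff - 60.0 if diff > 30.0 else diff

# Synthetic SHG traces for two monolayers (assumed offsets of 12.0 and 12.8 degrees).
theta = np.arange(0.0, 360.0, 5.0)
shg_a = np.cos(3 * np.radians(theta - 12.0)) ** 2
shg_b = np.cos(3 * np.radians(theta - 12.8)) ** 2
twist = twist_angle_deg(armchair_angle_deg(theta, shg_a),
                        armchair_angle_deg(theta, shg_b))
```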


Time-resolved PL 

To measure the decay time of the TM emission we used a Hamamatsu
streak camera system. The emission was polarization-selected for the TM
direction and sent to the streak camera. As shown in Extended Data Fig. 2,
a line-cut of the streak camera spectrum was fitted with a bi-exponential
function to determine the lifetime. The fitted lifetime is around 2 ns.


PL mapping of the heterobilayer device 

The spatially resolved PL mapping of the heterobilayer device is shown
in Extended Data Fig. 3. For this measurement, we used a 633-nm
continuous wave (CW) laser and selected TE polarization for excitation. The
sample was mounted on an Attocube ANC 300 piezo stage and scanned
as needed. Spectral band-pass filters were used to select the emission
from the bilayer, WSe2 and MoSe2 regions.


Measurements of lasing characteristics 

Extended Data Fig. 8 shows the schematic of the optical setup used
for the angle-resolved PL reflection and the coherence measurements
of the heterobilayer laser device. The sample was cooled to 5 K using
a Montana Instruments Fusion 2 cryostat. Fourier-space imaging
was used to measure angle-resolved reflection and micro-PL of the
device. For reflection, a tungsten halogen lamp was used. For micro-PL,
a pulsed Ti:sapphire laser (80 MHz repetition rate, 150 fs pulse width)
near-resonant with the WSe2 A-exciton (1.7 eV) was used to excite the
sample. The emission was collected using a 0.42 NA objective lens,
passed through a long-pass filter to filter out the excitation laser and
a linear polarizer to selectively measure TE and TM modes, and sent
to a Princeton Instruments spectrometer with a measured spectral
resolution of 0.3 nm. The entrance slit of the spectrometer is aligned
along the y direction. The slit width of 100 µm corresponds to a range
|k_x| < 0.13 µm⁻¹. The NA of the collection optics corresponds to a range
of |k_y| < 2 µm⁻¹.


Spatial coherence measurement was performed using a retro-reflector
Michelson interferometer setup as shown in Extended Data Fig. 8.
The emission, filtered to remove scattered pump laser light, was sent to a
50:50 beam splitter which divided the light into two paths, the mirror path
and the retro-reflector path. In order to change the time difference (τ) between
the two paths, the retro-reflector is mounted on a stepper motor which
can have a step size as small as 50 nm (about 0.167 fs). The interference
pattern was collected by a Princeton Instruments eXcelon
charge-coupled camera, with intensities described by:


$$I(r, z) = I(r) + I(-r) + 2\sqrt{I(r)\,I(-r)}\; g(r, -r)\,\sin\!\left(\frac{2\pi z}{\lambda_0} + \phi_0\right) \tag{3}$$


Here I(r) and I(-r) are intensities from the mirror and retro-reflector
arms, respectively, and are measured by blocking one of the arms of
the interferometer, z is the position of the retro-reflector, g(r, -r) is
the first-order spatial coherence for two positions separated by 2r, and
is proportional to the visibility of the interference fringe. To obtain the
visibility or g(r, -r), we scan the position z of the retro-reflector and
record the sinusoidal oscillation of I(r) versus z at each r, as shown
in Fig. 4c.
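
As a rough illustration of how g(r, -r) follows from a recorded fringe, the sketch below assumes the standard two-beam form: a sinusoid in z with offset I(r) + I(-r) and amplitude 2√(I(r)I(-r)) g(r, -r). The function names are hypothetical, and the max/min visibility estimate is a simplification; a sinusoidal fit would be more robust against noise.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def fringe_visibility(samples):
    """Visibility (I_max - I_min) / (I_max + I_min) of a sampled fringe."""
    i_max, i_min = max(samples), min(samples)
    return (i_max - i_min) / (i_max + i_min)

def first_order_coherence(visibility, i_mirror, i_retro):
    """Correct the visibility for unequal arm intensities to obtain g(r, -r).

    For offset I1 + I2 and amplitude 2 sqrt(I1 I2) g, the visibility is
    V = 2 sqrt(I1 I2) g / (I1 + I2), which is inverted here.
    """
    return visibility * (i_mirror + i_retro) / (2.0 * math.sqrt(i_mirror * i_retro))

def step_to_delay_fs(step_m):
    """Delay per motor step in fs, taking delay = step / c, which reproduces
    the 50 nm ~ 0.167 fs conversion quoted in the text."""
    return step_m / C * 1e15
```

The arm intensities passed to `first_order_coherence` correspond to the single-arm measurements obtained by blocking the other arm.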

The emission from our ultra-compact device is necessarily weak and
the detector efficiency at 1.35 eV is poor, hence it is difficult to
simultaneously achieve good time and spatial overlap of the interference signal
with the asymmetric interferometer. With a symmetric interferometer
built with two retro-reflectors, the alignment is much less sensitive to
slight changes in the incident beam; therefore we are able to achieve
good alignment of the signal path by using an auxiliary alignment laser
and obtain visibility for g^(1)(τ = 0) close to 0.8 (Extended Data Fig. 9).


Photon number 

Photon occupancy per pulse (I_p(k ~ 0)) was estimated from the total
count rate on the detector, n_c. The total integration time for
angle-resolved spectra was 90 s. The two values are related by: I_p = η p f n_c(k = 0).
Here η ~107 is the total detection efficiency of the setup, which is
independently calibrated by replacing the sample with a laser-coupled
single-mode fibre, p ~ 1 is the number of k-space modes within the
integrated region, and f = 80 MHz is the repetition rate of the pump laser.


Transparency condition 

The transparency condition is defined as the number of carriers
required for the energy difference between the quasi-Fermi levels in
the conduction (E_{f,c}) and valence (E_{f,v}) bands to equal the lasing energy
(E_{f,c} + E_{f,v} = 0). The quasi-Fermi levels are determined by the electron
density:


$$N_e = N_c \int_0^{\infty} \frac{1}{1 + \exp[\epsilon_c - \epsilon_{f,c}]}\, d\epsilon_c. \tag{4}$$


Here, ε_c = E/k_B T and ε_{f,c} = E_{f,c}/k_B T, E_c is the conduction band edge, k_B is
the Boltzmann constant and T is temperature. N_c is the effective density
of states in two dimensions for electrons with an effective mass m_e*:
Solving equation (4) for ε_{f,c} we obtain:


$$\epsilon_{f,c} = \ln\!\left[\exp\!\left(\frac{N_e}{N_c}\right) - 1\right] \tag{6}$$


The equation for valence band Fermi energy and hole carrier density
can be written in a similar way. Assuming n = N_e = N_h, and using the
effective masses of K-valley electrons and holes given in ref. *, we solve for
the carrier density that satisfies the transparency condition and obtain
N_tr = 8 x 10" cm”, which is in good agreement with the threshold carrier
density n_th = 5.7 x10" cm”.
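
The transparency calculation can be sketched numerically as below. Several things here are assumptions rather than the authors' inputs: the two-dimensional effective density of states is taken as N = m* k_B T / (π ħ²), the standard spin-degenerate expression (the corresponding equation is not reproduced in this text), and the effective masses in the usage note are placeholders, not the values from the cited reference. The reduced quasi-Fermi level follows equation (6), and the density at which the electron and hole levels sum to zero is found by bisection.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_0 = 9.1093837015e-31   # free-electron mass, kg

def effective_dos_2d(mass_ratio, temperature_k):
    """Assumed 2D effective density of states in m^-2: m* k_B T / (pi hbar^2)."""
    return mass_ratio * M_0 * K_B * temperature_k / (math.pi * HBAR ** 2)

def reduced_fermi_level(density, dos):
    """Equation (6): epsilon_f = ln(exp(n / N) - 1)."""
    return math.log(math.expm1(density / dos))

def transparency_density(m_e, m_h, temperature_k):
    """Carrier density (m^-2) where electron and hole reduced levels sum to zero."""
    n_c = effective_dos_2d(m_e, temperature_k)
    n_v = effective_dos_2d(m_h, temperature_k)

    def mismatch(n):
        return reduced_fermi_level(n, n_c) + reduced_fermi_level(n, n_v)

    # mismatch() increases monotonically with n, so simple bisection suffices.
    lo, hi = 1e-6 * min(n_c, n_v), 5.0 * max(n_c, n_v)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For equal electron and hole masses the condition reduces to n = N ln 2, which gives a quick sanity check, for example `transparency_density(0.5, 0.5, 5.0)` with placeholder masses of 0.5 m_0 at 5 K.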


Simplified rate equation model of the laser 
We use the following rate equations to describe the time evolution of
interlayer exciton density N and the photon density S in the lasing mode:


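$$\frac{dN}{dt} = \frac{\eta P}{\hbar\omega V_a} - \frac{(1-\beta_0)N}{\tau_{sp}} - \frac{F\beta_0 N}{\tau_{sp}} - a v_g (N - N_{tr}) S \tag{7}$$

$$\frac{dS}{dt} = \frac{\Gamma F \beta_0 N}{\tau_{sp}} + \Gamma a v_g (N - N_{tr}) S - \frac{S}{\tau_p} \tag{8}$$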


Parameters in the equation are listed in Extended Data Table 1. In the
heterobilayer system, electron and hole transfer takes place on the
subpicosecond timescale, much shorter than the interlayer exciton lifetime.
Therefore, we consider a pulsed pump P that creates an initial carrier
population of N(t = 0) ∝ P. The photon number I_p is proportional to the photon
density, and I_p = 1 at threshold. The simulated curve of I_p versus pump
power matched the experiment well, as shown in Extended Data Fig. 6.
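
A minimal sketch of this kind of simulation is given below; it is not the authors' fitting code. It assumes the rate equations take the form dN/dt = -(1 - β0)N/τ_sp - Fβ0N/τ_sp - a v_g (N - N_tr)S and dS/dt = ΓFβ0N/τ_sp + Γ a v_g (N - N_tr)S - S/τ_p after an impulsive pump. Densities are scaled by N_tr, time is in ns, and the lumped gain rate `gain` (standing in for a v_g N_tr) is a hypothetical value chosen only to place the threshold near N(0) = 1.4 N_tr. The values of F, Γ, β0, τ_sp and τ_p are those listed in Extended Data Table 1.

```python
def integrated_emission(n0, t_end=3.0, dt=2e-5, purcell=2.35, gamma=0.0208,
                        beta0=0.046, tau_sp=2.0, tau_p=3e-4, gain=4e5):
    """Time-integrated photon output after an impulsive pump (scaled units).

    n0 is the initial exciton density in units of N_tr; times are in ns.
    Integrates the two coupled equations with a fixed-step RK4 scheme.
    """
    def rhs(n, s):
        stim = gain * (n - 1.0) * s  # net stimulated emission (absorption if n < 1)
        dn = -(1.0 - beta0) * n / tau_sp - purcell * beta0 * n / tau_sp - stim
        ds = gamma * purcell * beta0 * n / tau_sp + gamma * stim - s / tau_p
        return dn, ds

    n, s, emitted = n0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(n, s)
        k2 = rhs(n + 0.5 * dt * k1[0], s + 0.5 * dt * k1[1])
        k3 = rhs(n + 0.5 * dt * k2[0], s + 0.5 * dt * k2[1])
        k4 = rhs(n + dt * k3[0], s + dt * k3[1])
        n_new = n + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        s_new = s + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        emitted += 0.5 * dt * (s + s_new) / tau_p  # photons leaving the cavity
        n, s = n_new, s_new
    return emitted
```

Sweeping `n0` and plotting `integrated_emission(n0)` on log-log axes should show the characteristic kink at threshold; with the placeholder gain used here the output grows much faster than the pump once N(0) exceeds roughly 1.4 N_tr.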
Extended Data Fig. 1 | Heterobilayer twist angle. a, b, Optical image (a) and
angle-resolved SHG measurements (b; open circles) of WSe2/MoSe2
heterobilayers. c, d, As a, b but for a different sample. The field of view of the
optical images is around 60 µm. Solid lines in b, d are fits by a cos²(3θ)
function, which give relative twist angles of 0.22° ± 1.78° for b, and 0.34° ± 1.5°
for d.

Extended Data Fig. 2 | Interlayer exciton lifetime. a, Time-resolved PL spectrum for TM emission. b, Line-cut of a near 1.38 eV. Red line is a bi-exponential fit to the
data, with a fitted lifetime of 2 ns.

Extended Data Fig. 3 | Spatial mapping of PL. The normalized intensity of PL
from the device is shown as a function of position. Spectral filters centred
around their respective exciton peak energies were applied for each image. The
white contours mark the regions of heterobilayer (a), MoSe2 (b) and WSe2 (c).
Extended Data Fig. 4 | Electric field profiles. Shown are simulated normalized
electric field profiles as a function of position near the centre of a grating
cavity with lateral dimensions of 100 µm × 100 µm. a, TE-polarized light at the
cavity resonance at k = 0, showing strong field enhancement in the grating layer
including at its surface where the heterobilayer is placed. b, TM-polarized light
at the same wavelength as a, showing negligible cavity effects. White lines
outline different layers of the grating cavity. The corresponding enhancement
of the exciton radiative decay rate, or the Purcell factor, is calculated to be
around 2.4.

Extended Data Fig. 5 | Power-dependence reproducibility. The photon
occupancy (red) and linewidth (blue) of TE emission from the heterobilayer
versus input pump power, similar to that shown in Fig. 3b of the main text but
measured on a different day to show the reproducibility of the device. The error
bars on the photon occupancy data include the shot noise and detector read
noise. The error bars on the linewidth data correspond to the 95% confidence
interval of the Lorentzian fit.

Extended Data Fig. 6 | Rate equation fitting. The log-log plot of photon
occupancy versus pump power. The diamonds represent measured data
shown in Fig. 3b of the main text, and the solid line is a rate-equation fitting.
Details of the rate equation simulation are described in Methods.

Extended Data Fig. 7 | Temperature dependence. Temperature-dependent
real space PL spectra of the sample studied in the main text.

Extended Data Fig. 8 | Experimental setup. Schematic diagram of the optical experimental setup as described in Methods. BS, beam splitter; LP, long-pass filter;
BP, band-pass filter; CCD, charge-coupled device.
Extended Data Table 1 | Rate equation fitting parameters


Parameter   Definition                      Value
F           Purcell factor                  2.35
Γ           Confinement factor              0.0208
V_a         Carrier injection volume        1.4 x 103 um*
τ_sp        Spontaneous emission lifetime   2 ns
τ_p         Photon lifetime                 0.3 ps
η           Absorption efficiency           20%
N_tr        Transparency density            8 x 10°'°cm?
β_0         Spontaneous emission factor     0.046
a           Absorption cross section        1.9 x 10° cm?
v_g         Group velocity                  1.5 × 10⁸ m/s


Definitions and values of the parameters used in the rate equation simulation. The values of F,
and V_a are estimates for our structure; N_tr is estimated in Methods; measured values are used for
τ_sp, τ_p and η; β_0 and a are fitting parameters. See Methods for definition of the rate equation using
these parameters.


Article 


Thermoelectric performance of a
metastable thin-film Heusler alloy


https://doi.org/10.1038/s41586-019-1751-9 


Received: 30 January 2018 


Accepted: 22 August 2019 


B. Hinterleitner, I. Knapp, M. Poneder, Yongpeng Shi, H. Müller, G. Eguchi,
C. Eisenmenger-Sittner, M. Stöger-Pollach, Y. Kakefuda, N. Kawamoto, Q. Guo,
T. Baba, T. Mori, Sami Ullah, Xing-Qiu Chen & E. Bauer


Published online: 13 November 2019 


Thermoelectric materials transform a thermal gradient into electricity. The efficiency 
of this process relies on three material-dependent parameters: the Seebeck 
coefficient, the electrical resistivity and the thermal conductivity, summarized in the 
thermoelectric figure of merit. A large figure of merit is beneficial for potential 
applications such as thermoelectric generators. Here we report the thermal and 
electronic properties of thin-film Heusler alloys based on Fe2V0.8W0.2Al prepared by
magnetron sputtering. Density functional theory calculations suggest that the thin 
films are metastable states, and measurements of the power factor—the ratio of the 
Seebeck coefficient squared divided by the electrical resistivity—suggest a high 
intrinsic figure of merit for these thin films. This may arise from a large differential 
density of states at the Fermi level and a Weyl-like electron dispersion close to the 
Fermi level, which indicates a high mobility of charge carriers owing to linear crossing 
in the electronic bands. 


Thermoelectric devices are able to directly convert thermal energy
into electrical energy. The efficiency of this process depends on the
temperature difference between the hot (T_h) and the cold (T_c) sides of
the device (defining the Carnot efficiency) and the performance of the
thermoelectric material, as expressed by the thermoelectric figure of
merit ZT = S²T/(ρλ), where S, ρ and λ are the Seebeck coefficient, the
electrical resistivity and the thermal conductivity, respectively, and T is the
temperature at which the thermoelectric properties are measured.
To improve the thermoelectric performance of a certain material, the
power factor, PF = S²/ρ, must be increased and the thermal conductivity,
λ = λ_e + λ_ph, must be reduced¹ (λ_e and λ_ph denote the electronic and
phononic contributions to λ, respectively).

The three individual physical properties constituting the figure of
merit are not independent from each other. Therefore, improving one
without causing another to deteriorate is difficult or impossible. λ_ph(T)
is the only quantity that can be changed freely without influencing the
others. Thus, the most promising way to improve ZT is the reduction
of dimensions and dimensionalities². In this regard, a vast quantity of
work has been done to obtain and study thermoelectric materials with
appropriate length scales, down to a few nanometres³.

The focus of the present study is thin-film full-Heusler alloys depos- 
ited on Si wafers. Besides the expectation that the thermoelectric prop- 
erties will be enhanced, thin films can also be a basis for applications 
in fields such as microelectronics. 

Half- and full-Heusler systems are useful thermoelectric materials
owing to their reasonably large PF and ZT values and the modest costs
of the materials, as well as their chemical and mechanical long-term
stability⁴,⁵. Whereas half-Heusler alloys can be described as XYZ
ternary compounds, full-Heusler systems have a composition of the form
X2YZ, where X and Y are (in general) transition-metal elements and Z is
a main-group element. Depending on the specific composition,
various ground states are possible, including semiconducting states with
different gaps in their electronic density of states (DOS).

We start by studying Fe2VAl. Although three metallic elements
constitute a ternary system, Fe2VAl exhibits a (pseudo-)gap in its electronic
DOS near the Fermi energy E_F, located near the valence-band edge,
where some residual states lie⁶. Elemental substitutions in Fe2VAl
enable us to tailor the electronic properties by modifying the electronic
DOS close to E_F and creating scattering centres to minimize the lattice
thermal conductivity. Following this strategy, we prepared a series of
compounds of the form Fe2V1-xWxAl.

The material studied here reveals a high thermoelectric performance,
well above any numbers reported in the literature so far’ ". However,
the metastable state of these films (as demonstrated below), despite
excellent intrinsic thermoelectric properties, might produce some
challenges in fabricating useful thermoelectric devices.


Crystal structure 


Figure 1a, b summarizes the X-ray diffraction pattern obtained for
Fe2V0.8W0.2Al from both the thin-film and bulk samples. The
corresponding Rietveld refinements, that is, the use of a nonlinear
least-squares fit to minimize the differences between the entire set of
observed X-ray peak intensities and the peaks calculated from a crystal
Fig. 1 | Structure and phases of Fe2V0.8W0.2Al and Fe2V0.8W0.2Al-Si substrate
from X-ray diffraction and electron microscopy. a, b, X-ray diffraction of
Fe2V0.8W0.2Al (Y_obs., yellow circles), alongside results from a Rietveld refinement
(Y_calc., black), the difference between experimental and model data (blue) and
the respective Bragg positions (red) for thin-film (a) and bulk (b) material.
c, Electron diffraction pattern of the Heusler thin film. d, Z-contrast image of
the Heusler material-substrate interface, showing the formation of Fe2Si
crystals inside the substrate. The dashed lines denote the width of the Fe2Si


model, are shown (solid lines), together with the respective Bragg 
positions (vertical lines) and the difference between the fit and the 
experimental data. 

There are distinct differences between the X-ray patterns of the
bulk and the thin-film sample. Although the bulk sample essentially
follows the predictions of the standard full-Heusler system (face-centred
cubic (fcc), Cu2MnAl-type structure, space group Fm-3m), the film
is characterized by the absence of a number of Bragg peaks, which
indicates that it belongs to the simpler W-type structure, as a
subgroup of full-Heusler systems (body-centred cubic (bcc), space group
Im-3m). In the fcc structure, Fe is located at the (8c) sites,
V and W are located at (4b) and Al lies at (4a). If some anti-site
occupation on the (4a) and (4b) sites occurs, the X-ray intensity of the (111)
peak weakens and even vanishes for a random disorder of V and Al.
A CsCl-type structure results. The absence of the (111) and (200) X-ray
peaks in the sputtered and heat-treated film is evidence of further
intermixing of atoms on the various lattice sites. In the thin-film
sample, a W-type structure is realized and the system changes from fcc
to bcc*. All atoms are equally distributed on the (2a) sites of this
structure. Formally, the lattice parameter changes from a = 5.7864 Å to
a = 2.8977 Å.


The film-substrate interface 


To derive detailed information in terms of composition and morphology
about the thin film (thickness, 1 µm), the substrate (thickness, 280 µm),
and the interface in between, we have carried out various
electron-microscopy-based investigations on Fe2V0.8W0.2Al sputtered on Si.
The electron diffraction pattern in Fig. 1c shows the polycrystalline
nature of the Heusler thin film. Because the thin film is deposited on a
interface and the vertical arrow marks the crossover from the Si substrate to
the Heusler thin film. e, Composite image of Si (green), Fe2Si (blue), Al2O3 (red)
and Heusler film (light grey). The arrow indicates the position at which the data in f were
taken. f, Lateral phase distribution along the interface between the Si substrate
and the Heusler thin film as deduced from combined electron energy loss
spectroscopy and EDX data. The data were taken from the site in e indicated by
the arrow. a.u., arbitrary units; MLLS, multiple linear least squares.


rough surface, no preferential orientation can be observed. In Fig. 1d
a scanning transmission electron microscopy image of the interface is
shown, using Z-contrast conditions. A diffusion zone with thickness of
about 20 nm can be seen where Fe2Si forms. This interlayer extends
into the Si substrate, but does not appear as a compact Fe2Si layer.
Instead, Fe2Si forms as very weakly connected islands, typically a few
tens of nanometres in size.

This layer is the result of diffusion of the deposited film at elevated
substrate temperatures and of diffusion during heat treatment of
the sample. Fe2Si, the primary product formed in this interface, is a
metallic ferromagnet, crystallizing either in a tetragonal or hexagonal
crystal structure”. The metallic nature of this material, as is clear
from band structure calculations, will probably generate only
moderate thermopower values. In any case, we expect its contribution
to the total electronic and thermal transport to be restricted to
a few per cent of the overall measured effects, because this thin 20-nm
interlayer corresponds to only 2% of the active thermoelectric
Heusler alloy.

We then used electron-energy-loss spectroscopy (EELS) to
analyse the interface in detail. We found that an alumina layer
with a thickness of 5 nm is also formed at the interface. Figure 1e gives
a compositional map created from EELS measurements, showing
Si in green, Fe2Si in blue, Al2O3 in red and the Heusler phase in light
grey.

A combined EELS and energy-dispersive X-ray spectroscopy (EDX)
profile, displayed in Fig. 1f, demonstrates that Si diffuses into the
Heusler thin film as well. Whereas the EELS data were fitted using a multiple
linear least-squares fit routine, in Fig. 1f the EDX data for Si were
normalized to the maximum peak height to give a better picture of the Si
distribution at the interface. It can be seen that the Si diffused up to
Fig. 2 | Temperature-dependent transport and thermoelectric properties of
thin-film Fe2V0.8W0.2Al. a, b, The temperature-dependent Seebeck coefficient
(a) and electrical resistivity (b) of the entire composite (layer, interface and
substrate), together with the thin-film value of Fe2V0.8W0.2Al, obtained from
equation (2). S_layer and Shee are the deduced Seebeck data with and without
Fe2Si as the interface, respectively. The corresponding data for the bulk
material are added for comparison. c, d, The temperature-dependent power
factor (c) and approximated figure of merit (d). PF values shown here refer to
S_comp of Fig. 2a. The size of the error bar in c results from an estimated error of
5% for both the Seebeck (a) and resistivity (b) data. ZT_app is evaluated to a first
approximation using the room-temperature (25 °C) thermal conductivity
(λ_dif = 2.70 W m⁻¹ K⁻¹; λ_eff = 3.02 W m⁻¹ K⁻¹). From the Wiedemann-Franz law, a
temperature-dependent thermal conductivity is estimated, keeping λ_ph
constant. The respective ZT_app data are indicated by open diamonds.


20 nm into the Heusler thin film. The energy loss near edge structures 
(ELNES) (see Extended Data Fig. 1) was used to fit the phase distribu- 
tion shown in Fig. 1f. 


Thermoelectric properties 


To define the thermoelectric properties of the Fe2V0.8W0.2Al thin film,
electrical resistivity, thermal conductivity and Seebeck measurements
were carried out at Technische Universität (TU) Wien, Vienna, Austria.
Figure 2a, b shows the temperature-dependent Seebeck coefficient, S,
and the electrical resistivity, ρ, respectively, of Fe2V0.8W0.2Al thin film
annealed at 450 °C for one week. ρ(T) of this Heusler alloy is
characterized by semiconductor-like resistivity behaviour, with decreasing
resistivity as the temperature increases. Absolute ρ(T) values
correspond fairly well with those derived for bulk materials with the same
concentration of W. In addition, the overall resistivity of the thin film
at lower temperatures is smaller by a factor of 2-3 compared to the
resistivity of the starting material Fe2VAl. In agreement with density
functional theory (DFT) calculations, the V-W substitution shifts the
Fermi energy towards the band edge of the conduction band, with an
increased charge carrier density compared to Fe2VAl.

The Seebeck coefficient S(T) of Fe2V0.8W0.2Al exhibits very large
negative values, indicating that electrons are the majority charge carriers
in this system. This conclusion is supported by Hall effect data taken
at high temperatures (details are discussed in Methods). The largest
values of the Seebeck coefficient occur at moderately high
temperatures, similar to archetypal Bi-Te compounds.

To corroborate these results, additional studies of transport
properties were done at the National Institute for Materials Science (NIMS),
Tsukuba, Japan, using a device similar to that used at TU Wien (ZEM-3,
ULVAC), and a different device (ZEM-2, ULVAC). These measurements
confirm the initial results. Physical property measurement system
(PPMS, Quantum Design) measurements based on a different data
acquisition principle yielded some dissimilar results; the Seebeck
coefficient obtained using that measuring technique revealed, in
part, even larger absolute values.

To get a closer look at, and a better understanding of, the individual contributions
of the active thermoelectric layer, the interlayer and the Si substrate, a
parallel-conductance model is assumed, in which the constituent parts
of the composite sample contribute individually to the observed effect.
In terms of the electrical resistivity, the well known parallel-resistance
model can be applied:


$$\frac{1}{R_{\mathrm{total}}} = \sum_i \frac{1}{R_i} \tag{1}$$


Similarly, the Seebeck coefficient can be understood from 


$$S_{\mathrm{total}} = \frac{\sum_i S_i \sigma_i d_i}{\sum_i \sigma_i d_i} \tag{2}$$


where i = {layer, int, sub} denotes the active thermoelectric layer, the
interface and the substrate, respectively. Equation (2) indicates that
in such a distinct structure the individual Seebeck coefficients S_i
contribute to the total measured coefficient in a weighted manner,
corresponding to the electrical conductivities σ_i = 1/ρ_i and the
thicknesses d_i of the slabs involved. To ensure reliable results, the active
thermoelectric layer was removed mechanically from the substrate by
polishing, and the substrate was then re-measured. The resistivities, which are 4-5 orders
of magnitude larger than that of the Heusler film, were obtained,
together with large negative Seebeck values (see Extended Data Fig. 3).
In the absence of available data, a very large Seebeck effect is assumed
for the interface (predominantly Fe2Si; S_int ~ 100 µV K⁻¹), together with
a very low electrical resistivity (ρ_int ~ 100 µΩ cm); this may be
considerably underestimated, since the Fe2Si interface layer is not a continuous
film; instead, it seems to be made up of small Fe2Si islands, weakly
connected and arranged along the interface (see Fig. 1d). Based on the
above data and assumptions, the Seebeck coefficient of the active layer,
S_layer, can first be derived from equation (2). Obtained in this way,
S_layer(T) is added to Fig. 2a. As a second step, the Fe2Si interface is
neglected in the analysis, because the condition of percolation might
not be fulfilled. Data from this procedure are labelled as Steverift Fig. 2a.
We note that in this second case, SE) ..(T) is almost the same as the total
measured value, S_comp(T). The intrinsic nature of the very large
thermopower values of thin-film Fe2V0.8W0.2Al can be inferred from a direct
comparison of the thin-film data evaluated here and those of the pure
substrate. As shown in Extended Data Fig. 3, the pure substrate
exhibits even larger values, twice as large in a temperature range from 500
to 800 K.
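
The weighting in the parallel-conductance model can be sketched in a few lines, under the assumption that the total Seebeck coefficient is the conductance-weighted mean of the slab values, S_total = Σ S_i σ_i d_i / Σ σ_i d_i with σ_i = 1/ρ_i. The function names are hypothetical, and the film and substrate numbers in the usage note are illustrative placeholders rather than measured values; only the thicknesses (1 µm, 20 nm, 280 µm) and the assumed interface values (100 µV K⁻¹, 100 µΩ cm) come from the text.

```python
def composite_seebeck(slabs):
    """Conductance-weighted Seebeck coefficient of parallel slabs.

    slabs is a list of (S, rho, d) tuples in V/K, Ohm m and m.
    """
    weighted = sum(s * d / rho for s, rho, d in slabs)
    conductance = sum(d / rho for _, rho, d in slabs)
    return weighted / conductance

def layer_seebeck(s_total, layer, others):
    """Invert the weighted mean for the active layer.

    layer is (rho, d) for the film; others lists (S, rho, d) for the
    interface and substrate, whose Seebeck coefficients are taken as known.
    """
    g_layer = layer[1] / layer[0]
    g_others = sum(d / rho for _, rho, d in others)
    weighted_others = sum(s * d / rho for s, rho, d in others)
    return (s_total * (g_layer + g_others) - weighted_others) / g_layer
```

For example, with a placeholder film of (-600e-6, 1e-5, 1e-6), the interface (100e-6, 1e-6, 20e-9) and a placeholder substrate of (-1.2e-3, 1.0, 280e-6), calling `layer_seebeck` on the composite value returns the film value again. Dropping the interface from `others` corresponds to the second step of the analysis described in the text.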

There are only minor changes compared to the experimentally
deduced data owing to the very thin interface (compared to the active
layer and the thick substrate), and the very large resistivity values in the
lower temperature range of the substrate. These data indicate that a
Fig. 3 | Time-dependent temperature response curve of Fe2V0.8W0.2Al. The
Fe2V0.8W0.2Al is deposited on a Si substrate with an additional Al top layer
(100 nm). The data are obtained by the ultrafast laser flash method (rear-face
heating, front-face detection) using nanosecond pulse heating. The signal data are
enlarged around the instant of pump-laser irradiation at t = 0. Inset, the entire signal
from one pulse to just before the next pulse. The red solid lines are a least-squares fit.


substantial enhancement of the thermopower takes place if the Heusler
system is sputtered on the Si substrate. Recently it has been observed
that perovskite-based substrates boost the thermopower of cobaltite
thin films", resulting in a >300% increase in the power factor PF of
the cobaltite. An enormous value of the figure of merit, ZT = 2.7, was
recently reported for cubic Ge,.,Sb, Te deposited on a Si wafer”.

The electronic part of the thermoelectric performance is obtained
from the power factor PF = S²/ρ. Putting both experimental
quantities together reveals the temperature dependence of PF, plotted in
Fig. 2c. A narrow range where PF reaches very large values (more
than 40 mW m⁻¹ K⁻²) is determined to lie between roughly 50 °C and
150 °C. Such values are about ten times larger than those obtained for
Bi-Te-based materials’*””.

To derive the thermal conductivity of thin-film Fe2V0.8W0.2Al, both
diffusivity and effusivity measurements were carried out at room
temperature, using an ultrafast laser flash method’ and a picosecond
thermoreflectance measurement method”, respectively.

A typical transient temperature curve taken at T = 300 K is shown in
Fig. 3. The solid red line shows the theoretical evolution of the
time-dependent temperature T(t) of the front face if the rear face is heated’®.
The least-squares fit in Fig. 3 (discussed in detail in Methods) reveals
the thermal conductivity as determined from the measured thermal
diffusivity to be λ_dif = 2.70 W K⁻¹ m⁻¹. This value is more than 25% smaller
than the respective figure derived for bulk Fe2V0.8W0.2Al (ref. 7°).

Employing a front-heating-front-detection setup, the effusivity e can
be obtained’*”, yielding λ_eff = 3.02 W K⁻¹ m⁻¹ (where λ_eff is the thermal
conductivity determined from the measured thermal effusivity), which
is in very good agreement with the value derived from the thermal
diffusivity data. Extended Data Fig. 7a-c and Methods summarize the
experimental data and contain a thorough discussion.

Assuming, as a first approximation, the above-indicated thermal
conductivities at room temperature for Fe2V0.8W0.2Al, the approximated
figure of merit ZT_app can be evaluated. Results are shown in Fig. 2d.
The moderate value of λ, in the context of the very large power factor,
enables the figure of merit to attain values around or even above 5; such
values are well above all others reported so far in the literature. Among
the largest values obtained recently are ZT = 2.4 in artificial layers
of Bi2Te3 and Sb2Te3 at room temperature’, ZT = 2.6 in SnSe along the
b axis of the unit cell®’, and ZT = 2.5 in p-type PbTe-SrTe”’.

Using the Wiedemann-Franz law, the data for the temperature-dependent
λ_e can be assessed and added to λ_ph, which is assumed to
be constant. This procedure enables us to derive another set of ZT_app(T)
values, shown by the open diamonds in Fig. 2d. Again, ZT_app reaches
values near 5 around 350-400 K, the range in which Bi-Te-based
systems have their best performance.
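
The arithmetic behind these estimates can be sketched as follows, taking ZT = PF·T/λ and, for the Wiedemann-Franz step, λ(T) = λ_ph + L·T/ρ(T) with λ_ph fixed from the room-temperature value. The Lorenz number used here (the Sommerfeld value 2.44 × 10⁻⁸ W Ω K⁻²) and the resistivity in the usage note are assumptions for illustration, not values taken from the paper.

```python
LORENZ = 2.44e-8  # W Ohm K^-2, Sommerfeld value (assumed here)

def figure_of_merit(power_factor, temperature, thermal_conductivity):
    """ZT = PF * T / lambda, with PF in W m^-1 K^-2 and lambda in W m^-1 K^-1."""
    return power_factor * temperature / thermal_conductivity

def electronic_conductivity(temperature, resistivity):
    """Wiedemann-Franz estimate lambda_e = L T / rho (rho in Ohm m)."""
    return LORENZ * temperature / resistivity

def total_conductivity(temperature, resistivity, lambda_room, rho_room, t_room=300.0):
    """lambda(T) with the phonon part held constant at its room-temperature value."""
    lambda_ph = lambda_room - electronic_conductivity(t_room, rho_room)
    return lambda_ph + electronic_conductivity(temperature, resistivity)
```

With the quoted numbers, `figure_of_merit(40e-3, 350.0, 2.70)` is about 5.2, consistent with values "around or even above 5". For a placeholder resistivity of 1e-5 Ω m at both temperatures, `total_conductivity(375.0, 1e-5, 2.70, 1e-5)` is about 2.88 W m⁻¹ K⁻¹, so the Wiedemann-Franz correction lowers ZT only modestly.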

The small amount of active thin-film material compared to the
thickness of the substrate means care will be required when it is used in
thermoelectric applications”. Certainly, when integrated in thin-film
devices and sensors, using thin-substrate structures might be
beneficial.


Discussion 


The thermoelectric performance of the full-Heusler-based thin film 
deposited on a Si wafer, although unexpected, may have two driving 
forces. First, the change of crystal structure from fcc in the bulk mate- 
rial to bcc in the thin film. The bcc-type crystal structure, although 
metastable in the bulk state, becomes stabilized if the material is 
deposited as a film on the Si substrate. As a consequence, benefi- 
cial modifications of the electronic and phononic properties might 
occur, resulting in the increase in the thermoelectric figure of merit 
we observe. Second, the electronic structure is advantageous with 
respect to electronic transport in the material. In this context, the 
very large values of S(T) can, at least qualitatively, be understood from
the large variation in the electronic DOS near the Fermi energy. Mott’s 
theory of thermopower, that is 


$$S(T) = -A\,T\,\frac{1}{N(E)}\left.\frac{\partial N(E)}{\partial E}\right|_{E=E_F}, \tag{3}$$


demonstrates that the absolute Seebeck values of a system are
proportional to the logarithmic derivative of the density of states N(E)
with respect to energy at the Fermi energy E_F (A is a constant and T is
the temperature). The sign of the thermopower follows here from
the sign of ∂N(E)/∂E.

To obtain the specific electronic DOS and the respective electronic
structure, we performed DFT calculations by constructing 80-atom
bcc-type supercells, with an experimental composition of Fe0.5V0.2W0.05Al0.25.

Spin-polarized calculations revealed that 25% of the Fe atoms carry
small local spin moments of about 0.2-0.4 µ_B (where µ_B is Bohr’s
magneton), when doped W atoms are their nearest neighbours, with the
shortest Fe-W bonding length about 2.46-2.49 Å; the remaining Fe
atoms and all the V, W and Al atoms do not have any local spin moment.
As shown in Fig. 4, the majority spin-up and minority spin-down
densities do not show distinct differences. The electronic states at the Fermi
level are dominated by contributions from Fe, V and W; the
contributions from Al are almost negligible.

The electronic DOS exhibits a deep pseudo-gap of about 0.3-0.4 eV just
below the Fermi level. More importantly, at the Fermi level the total DOS
sits on an extremely steep shoulder with a very large slope (Fig. 4), that
is, ∂N(E)/∂E at E = E_F is 1.39 states per eV² for the spin-up channel and
4.54 states per eV² for the spin-down channel. Following equation (3),
this agrees with the experimentally observed very large negative values of
the Seebeck coefficient. The narrow region in which very large S(T) values
are obtained is thought to result from a combination of the large
logarithmic derivative of the electronic DOS (equation (3)) and the fact
that, owing to the narrowness of the energy gap, both electrons and holes
are involved as charge carriers, because electrons are thermally excited
across the band gap. This leads to a deviation from the linearity inferred
from equation (3) and to a reduction in the absolute values of S(T) at
higher temperatures owing to an increased number of charge carriers. The
latter is fairly well reflected in a two-carrier analysis of Hall data,
explained in detail in Methods.

Fig. 4 | Electronic DOS of Fe₂V₀.₈W₀.₂Al. The electronic DOS is derived
from the lowest-energy 80-atom bcc-type supercell obtained here and is
shown below and above the Fermi energy (positive densities represent the
majority spin-up channel and negative densities correspond to the minority
spin-down channel). The total DOS, the contributions of the various atoms
and the partial DOS from which they originate are shown by the different
colours. The blue vertical line marks the Fermi energy, E_F.

In order to explain the effect of substituted W on the band structure of
Fe₂V₀.₈W₀.₂Al, we built an artificial tetragonal Fe₂VAl parent compound
composed of two bcc-type cells (see Extended Data Fig. 9a). Spin-polarized
calculations show that tetragonal Fe₂VAl is non-magnetic, and a
Dirac-nodal-line-like feature appears in its band structure near the Fermi
level owing to both a linear band crossing and the negligible spin-orbit
coupling. After W doping, we observed two distinct differences in the
electronic structure of the 80-atom bcc-type supercell compared to the
parent compound. First, magnetic moments develop at some Fe sites, as
discussed above. Second, the Fermi level is shifted upwards relative to
the case without W doping, resulting in a notable extension of the
electron pockets near the Γ and X points (Extended Data Fig. 10a, b). This
is mainly because the minima of the electron pockets shift down in energy
relative to the case without W doping. More importantly, the W-induced
magnetic ordering splits the Dirac-nodal-line-like band structure, and as
a result Weyl nodes seem to emerge. The Weyl nodes are closer to the Fermi
level than are the Dirac nodal lines in the parent material where W is
absent. In other words, bcc-type Fe₂V₀.₈W₀.₂Al exhibits potential
Weyl-like fermions around the Fermi level for both the spin-up and
spin-down channels, thereby leading to a possibly profound, non-trivial
topology of its electronic band structure. The appearance of the nearly
linear crossings for both spin channels indicates enhanced mobilities of
the charge carriers and constitutes the basis of topological fermions, in
analogy with Dirac or Weyl fermions, responsible for the remarkable
transport properties observed. In such systems, both the Weyl-like
fermions and their non-trivial surface electronic states are highly robust
and are protected from backscattering by non-magnetic disorder and
defects. The scenario outlined here might be related to the mechanisms
behind the thermoelectric performance of Fe₂V₀.₈W₀.₂Al, as experimentally
observed in this study. Theoretical considerations of topological
insulators support such conceptions, and ZT values as large as 20 have
been proposed.

As demonstrated in several recent studies, a multi-valley structure 
of electronic bands near the Fermi energy is beneficial for increased 
thermopower. Taking into account the DFT results of Extended Data 
Fig. 10a, b, there are, besides a narrow hole structure, several highly 
degenerate electronic valleys located near the Fermi energy. The 
respective charge carriers are characterized by large Fermi velocities 
and large mobilities (as discussed in detail in Methods). As a result, 
the dominant diffusion component of the Seebeck effect is also large. 

In summary, measurements of the electrical resistivity, thermal
conductivity and the Seebeck effect have revealed very large values of the
thermoelectric figure of merit for Fe₂V₀.₈W₀.₂Al thin films deposited on
a Si substrate. Electron microscopy reveals a narrow diffusion zone
between the Heusler thin film and the Si substrate; in addition, the
ordinary structure of Heusler alloys (fcc, Cu₂MnAl-type) transforms to a
bcc W-type structure, which is metastable in the bulk. DFT calculations
reveal a large slope of the electronic DOS very close to the Fermi energy,
in agreement with the large Seebeck values observed, as well as Weyl-like
fermions similar to those of topological Weyl semimetals, which are known
for very large charge carrier mobilities. Such mobilities are a
prerequisite for high-performing thermoelectrics.
Methods 


Experimental details 

Thin film samples were prepared by magnetron sputtering using a 
single target consisting of the stoichiometrically prepared bulk sample 
in the form of a disk 25 mm in diameter and 3 mm in height. The base 
pressure was 10“ Pa, the working gas was Ar with a pressure of 2 Pa 
and the distance from target to substrate was about 3 cm. An undoped 
silicon wafer with a thickness of 0.279 mm and [100] orientation
was used as substrate (Siegert Wafer,
https://www.siegertwafer.com/Silicon_Wafers.html). The electrical
resistivity of the wafer was >100 Ω cm at room temperature. The substrate
temperatures were varied up to 650 °C and the Heusler alloy was deposited
on the substrate with a deposition rate of 0.5 nm s⁻¹ to create samples
with thicknesses ranging from a few hundred nm to 3 µm. To stabilize the 
crystal structure and recrystallize the amorphous state, the samples 
were heat treated for one week at temperatures ranging from 150 °C 
to 650 °C. 

The phase purity of the samples and the lattice parameter were verified
by X-ray diffraction, using Cu Kα radiation (D5000, Siemens), and by
electron diffraction. Electron microscopy was used to obtain chemical
information at the film-substrate interface. For this, a field emission
gun transmission electron microscope (FEI TECNAI F20) was used. For
chemical analysis an electron filter was attached (GATAN GIF Tridiem).
The samples were mechanically thinned to a thickness of 10 µm and
further processed within an ion mill (GATAN PIPS). The last polishing
step used 200-eV Ar⁺ ions to remove beam damage from the prior
milling process.

Transport data (the resistivity and Seebeck effect above room tem- 
perature) were taken using an ULVAC ZEM-3. The Hall effect above 
room temperature was studied using a home-made set-up based on a
superconducting magnet (up to 12 T). The electrical resistance and the 
Hall resistance were derived using the van der Pauw technique, using 
an a.c. resistance bridge (LakeShore 370). 

Thermal diffusivity was measured at room temperature using the 
ultrafast laser flash method (rear heating-front detection). The
thermoelectric film is heated by a nanosecond laser pulse through the Si 
substrate, which is transparent at the wavelength of the heating laser 
beam. Opposite to the heating beam, a time-dependent temperature 
response is detected via the thermoreflectance technique by a prob- 
ing laser beam, which is illuminated onto the Al surface (thickness, 
100 nm) that is deposited on the thermoelectric film. The ultrafast 
laser flash method is a natural extension of the laser flash method, 
which is a standard method used to measure thermal diffusivity of 
bulk materials. This method is established as one of the metrological 
standards of thermal diffusivity of thin films under metric convention 
maintained by the Bureau International des Poids et Mesures (BIPM). 
It is considered to be the standard method to measure thermal diffusivity
of films as thin as several hundred nanometres. A reference sample for
the ultrafast laser flash method (680-nm TiN on a synthetic quartz
substrate) was previously developed as a thin-film thermal conductivity
standard sample, and supplied by the National Metrology Institute of
Japan (see Extended Data Fig. 8).

The thermal effusivity was obtained by applying the picosecond 
time-domain thermoreflectance technique using a NanoTR (Pico- 
Therm) and a customized thermal analysis system based on PicoTR 
(PicoTherm). The customized system enables a selective derivation 
of the thermal effusivity of the thin-film area by focusing the probe 
laser beam. Thus, one can even detect time-domain thermoreflec- 
tance signals from sample surfaces that are somewhat rough and 
that have only a narrow terrace structure. The 100-nm Al thin film 
deposited on the sample surface acts as a heat source of known areal 
heat capacity. 

Specific heat data on bulk Fe₂V₀.₈W₀.₂Al were collected from a
differential scanning calorimetry measurement (Linseis, DSC-PT10).


Ab initio calculations 

Within the framework of DFT, we performed calculations for structural
optimization and the electronic band structures. DFT calculations were
performed using the Vienna ab initio Simulation Package (VASP), with
projector augmented wave pseudo-potentials and the generalized gradient
approximation within the Perdew-Burke-Ernzerhof exchange-correlation
functional. The adopted pseudo-potentials of all elements treat semi-core
electrons as valence electrons. The structural parameters were optimized
by minimizing the interionic forces to below 0.0001 eV Å⁻¹. The cut-off
energy for the expansion of the wave function into plane waves was
400 eV. The Brillouin zone integrations were sampled with a resolution of
2π × 0.014 Å⁻¹. To analyse the effects of W doping on the band structure
of the supercell, we adopted the unfolding technique implemented in VASP.

Theoretical calculations of Fe₂V₀.₈W₀.₂Al within the random bcc-type
structure are difficult: using currently available theoretical techniques
for DFT, it is impossible to consider the random elemental distribution
in Fe₂V₀.₈W₀.₂Al with sufficient accuracy; that is, although the periodic
unit cells have finite lattice sites, there are too many possible atomic
distributions of Fe, V, W and Al to sample them all. Therefore, in order
to simulate the situation as closely as possible to reality, we first
constructed a 3 × 3 × 3, 54-atom bcc supercell to investigate the
lowest-energy configuration by considering the atomic distribution and
nearest neighbours among Fe, V, W and Al. The results suggest that the
lowest-energy 54-atom configuration is one in which W tends to bind with
nearest-neighbour V atoms in the same atomic layer and where the W-V
layer is in between two atomic Fe layers such that each V and W atom has
at least one neighbouring Fe atom. All Al atoms occupy the same atomic
layers, located between two atomic Fe layers. Based on this lowest-energy
54-atom configuration, and in order to consider more possible
configurations, we additionally constructed an 80-atom bcc-type
supercell. Within this 80-atom supercell, we varied the Al, V and W
positions in the bcc-type Fe framework (Fe is known to have an inherently
stable bcc ground state). In total, we constructed 45 different
configurations; from these we found an energetically favourable 80-atom
supercell with 40 Fe, 16 V, 4 W and 20 Al atoms, which matches the
experimental composition of Fe₂V₀.₈W₀.₂Al.
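
The bookkeeping behind the supercell can be checked in a few lines. The sketch below rescales the quoted atom counts to one four-atom Heusler formula unit and counts how many ways V, W and Al can be distributed over the non-Fe sites of a fixed Fe framework. The helper names are hypothetical, and the count ignores symmetry-equivalent arrangements, so it is only an upper bound on the number of distinct configurations.

```python
from math import factorial

# Atom counts of the energetically favourable 80-atom supercell (from the text).
supercell = {"Fe": 40, "V": 16, "W": 4, "Al": 20}


def per_formula_unit(counts, atoms_per_formula_unit=4):
    """Scale atom counts so that they sum to one four-atom formula unit."""
    total = sum(counts.values())
    return {el: n * atoms_per_formula_unit / total for el, n in counts.items()}


def arrangements_on_fixed_fe_framework(counts):
    """Multinomial count of V/W/Al distributions over the non-Fe sites."""
    others = [n for el, n in counts.items() if el != "Fe"]
    ways = factorial(sum(others))
    for n in others:
        ways //= factorial(n)
    return ways


print(per_formula_unit(supercell))
print(f"{arrangements_on_fixed_fe_framework(supercell):.3e} arrangements (symmetry ignored)")
```

Compared with the 45 configurations that were actually computed, this count shows why exhaustive sampling is out of reach.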

The film-substrate interface. The deposition of the Heusler film
Fe₂V₀.₈W₀.₂Al (thickness, 1 µm) on top of a cleaned (100) Si substrate
(thickness, 279 µm) forms a composite (see Fig. 1e, f), consisting
of the active thermoelectric layer, the interface and the substrate.
The interface is created by diffusion processes in both directions (that
is, from the layer to the substrate and vice versa), because of elevated
temperatures during the sputtering process and during the heat
treatment. The emergence of products resulting from diffusion in both
directions follows the general thermodynamic rules of phase formation.
The respective phase diagrams indicate that Al and Si form a eutectic
system at 12.6 wt% Si at 577 °C. In addition, Si does not dissolve any Al.
As a consequence, Al should not diffuse into the Si substrate. On the 
other hand, Fe-Si, Si-V and Si-W constitute several binary compounds; 
details on the various phases can be found, for example, in ref. *”. 

Electron-microscopy-based studies that we carried out on the thin
Heusler film deposited on Si indicate that two main interface structures
are established between Si and the Heusler film: an Al₂O₃ interlayer and
binary Fe₃Si. While the former has a thickness of about 10 nm, the latter
extends about 20 nm into the Si substrate (see Fig. 1e).

ELNES of the interfacial atomic layers enables us to understand the
chemical bonding at the interface. Extended Data Fig. 1a illustrates the
oxygen K-edge and the vanadium L-edge at 532 eV and 508 eV energy
loss, respectively. Extended Data Fig. 1b shows the iron L-edge at 708 eV
energy loss for both Fe₃Si and the Heusler thin film. The corresponding
spectra were used to fit the phase distribution in Fig. 1f.

Thermoelectric properties. Studies of the electronic and thermal
transport were performed on samples about 10 mm long and 4 mm wide. A
scheme of such a measurement is shown in Extended Data Fig. 2. This
schematic drawing demonstrates that both the heat and the electric
current flow from the upper to the lower part of the sample, through a
parallel arrangement of three different layers: the active thermoelectric
material, the interface layer and the Si wafer. In order to derive
individual contributions, the electrical resistivities can be considered
in terms of classical parallel circuits. However, the Seebeck effect
requires special consideration⁴³. Accordingly, the Seebeck effect
obtained consists of each individual contribution weighted by the
respective electrical conductivity σ and the thickness d of the layers.
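
The weighting described above can be sketched as a conductance-weighted mean, S = Σ S_i σ_i d_i / Σ σ_i d_i, which is the usual expression for layers connected in parallel. The function name and all Seebeck values below are hypothetical; the film resistivity is taken as 500 µΩ cm (the text quotes "several hundred"), and the wafer resistivity uses the quoted lower bound of 100 Ω cm.

```python
def composite_seebeck(layers):
    """Seebeck coefficient of parallel layers.

    `layers` is a list of (seebeck, conductivity, thickness) tuples; each
    layer is weighted by conductivity * thickness (its sheet conductance).
    """
    weights = [sigma * d for _, sigma, d in layers]
    weighted = sum(s * w for (s, _, _), w in zip(layers, weights))
    return weighted / sum(weights)


# Illustrative (hypothetical) inputs in SI units: S in V/K, sigma in S/m, d in m.
film = (-500e-6, 1.0 / 5e-6, 1e-6)         # 500 micro-ohm cm = 5e-6 ohm m, 1 micrometre thick
substrate = (1000e-6, 1.0 / 1.0, 279e-6)   # 100 ohm cm = 1 ohm m, 279 micrometres thick

total = composite_seebeck([film, substrate])
print(f"composite S = {total * 1e6:.1f} microvolt/K")
```

With these inputs the film carries almost all of the weight even though the substrate is far thicker, which is the argument made for the measured composite.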

To isolate the contribution associated with the Si substrate, both 
the Heusler film and the interface layer were removed mechanically 
by grinding; the substrate was thinned by about 5 µm. Transport data
derived for the Si substrate are summarized in Extended Data Fig. 3a, b.

The procedure to analyse the thermopower data of thin films deposited
on substrates in terms of equation (2) has been applied recently to
various systems, such as Ca₃Co₄O₉ on SrTiO₃ or LaAlO₃, and used to
determine the effect of oxygen reduction in bulk SrTiO₃ substrates when
a thin film was deposited in an oxygen-deficient environment, in Si-Ge
superlattices, and in high-temperature superconducting films. It is based
on the assumptions behind the so-called Kohler rule, which develops into
the Nordheim-Gorter rule. The Nordheim-Gorter rule is the analogue for
thermopower of Matthiessen’s rule for electrical resistivity.

There is, on the one hand, a substantial difference (several orders of
magnitude) between the temperature-dependent resistivity of the composite
of Heusler film, interface and Si substrate and that of the Si substrate
alone. As a result, charge carriers predominantly pass through the
Heusler film, which dominates the electrical conductivity. On the other
hand, such distinct differences are absent in the Seebeck effect. As is
clear from equation (2), separating the individual thermopower
contributions requires determining the respective electrical
conductivities, σ = 1/ρ, and the respective layer thicknesses.
Equation (2) shows that, because of the huge differences in
conductivities (more than five orders of magnitude in the lower ranges of
temperature), the contribution of the Si substrate to the overall
measured effect of the composite remains small, even considering that the
substrate is about a hundred times thicker than the Heusler layer. A
similar argument holds for the interface and its contribution to the
measured data: even assuming very large Seebeck values for metallic Fe₃Si
(for example, 100 µV K⁻¹) and small resistivities (for example,
100 µΩ cm), the interface is two orders of magnitude thinner than the
film and therefore does not contribute substantially to the observed
Seebeck data. As a result, the deduced S(T) values of the Heusler film
are similar to those obtained from measurements of the entire set-up.

If the contribution of the interlayer is neglected (the island-like
structures are only weakly connected to one another), only small changes
in S(T) would result, as demonstrated in the main text.

The simple overall behaviour of S(T) as described by Mott’s equation
(equation (3)) is modified by the fact that the observed thermopower
does not consist only of contributions from a single charge carrier type.
Instead, owing to the relatively narrow gap in N(E) next to the Fermi
energy, both electrons and holes become actively involved, that is,

$$S = \frac{\sum_i \sigma_i S_i}{\sum_i \sigma_i} \qquad (4)$$

Here, σ_i and S_i are the electrical conductivities and the Seebeck
coefficients, respectively, of the different sets of charge carriers.
Equation (4) indicates that the measured thermopower data of narrow-gap
semiconductors might exhibit complicated behaviours: the temperature
dependence can deviate strongly from the almost linear form inferred from
equation (3), owing to the temperature-dependent change in the charge
carrier density n as well as the temperature-dependent electron-phonon
interaction that influences the electrical conductivity of a system.

High-temperature and high-field Hall studies. The following 
subsections are based on the data obtained from additionally prepared 
thin films based on Fe₂V₀.₈W₀.₂Al deposited on a Si substrate and
heat-treated at 450 °C.

To derive qualitative and quantitative information about the domi- 
nant charge carriers present in these thin-film Heusler systems, we 
carried out Hall effect measurements at temperatures above room 
temperature and at magnetic fields up to 11 T, employing the van der 
Pauw technique; results are shown in Extended Data Fig. 4a. Besides a 
small positive contribution for the room temperature run, all remaining
Hall resistivities, ρ_xy(B), are negative; however, they exhibit
strong curvatures with increasing magnetic field. (Here, ρ_xy(B)
characterises an electrical resistivity measured along the y direction
when an electrical current is flowing along the x direction and the
magnetic field B is in the z direction.) This behaviour indicates that there is no 
unique set of charge carriers—instead, different types are present, 
for example, electrons and holes, or a set of electrons with different 
mobilities. 

To account theoretically for ρ_xy(B), a simplified two-carrier model
is generally used, with four adjustable quantities. An improvement
of this model has reduced the number of fit parameters to two, and 
two experimentally obtainable quantities serve as input parameters. 
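
For orientation, the four-parameter version of such a model can be written down directly. The sketch below implements the standard textbook two-band expression for ρ_xy(B) with one hole band and one electron band; it is not the reduced two-parameter variant referred to above, and the function name and all carrier parameters are hypothetical.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C


def two_carrier_rho_xy(b_field, n_h, mu_h, n_e, mu_e):
    """Textbook two-band Hall resistivity (ohm m) for holes (h) and electrons (e).

    Densities in m^-3, mobilities (positive magnitudes) in m^2 V^-1 s^-1,
    magnetic field in T.
    """
    b2 = b_field ** 2
    numerator = (n_h * mu_h ** 2 - n_e * mu_e ** 2) + (n_h - n_e) * (mu_h * mu_e) ** 2 * b2
    denominator = (n_h * mu_h + n_e * mu_e) ** 2 + ((n_h - n_e) * mu_h * mu_e) ** 2 * b2
    return b_field / E_CHARGE * numerator / denominator


# Hypothetical parameters: few high-mobility holes, many low-mobility electrons.
params = dict(n_h=1e24, mu_h=0.5, n_e=1e27, mu_e=0.01)
for b in (0.5, 2.0, 5.0, 11.0):
    print(f"B = {b:4.1f} T  rho_xy = {two_carrier_rho_xy(b, **params):+.3e} ohm m")
```

With a single carrier type the expression reduces to a straight line, ρ_xy = -B/(n e), so curvature or a sign change with field is the signature of more than one set of carriers.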

The solid lines in Extended Data Fig. 4a are least-squares fits to the
data, which give the temperature-dependent majority and minority
charge carrier densities n₁ and n₂ and the respective mobilities μ₁ and
μ₂, as summarized in Extended Data Fig. 4b. For the high-temperature 
range, there are two different sets of electrons: the set with higher 
charge carrier concentration exhibits lower mobilities, and the set with 
lower charge carrier concentration is characterized by very high mobil- 
ity. The temperature dependence of n,;reveals a weak minimum around 
100 °C, roughly corresponding to the extremum of the temperature- 
dependent Seebeck coefficient. Within the two-carrier model, at room 
temperature the positive low-field data requires asmall concentration 
of holes with very high mobility as charge carriers (n,=1.03 x 108 cm? 
and ft, = 5,920 cm’ V's") anda much larger content of electrons with 
reduced mobilities (n,=1.1x 10” cm? and 4, =-7.5cm’ V's"). 

To compare the electronic transport in both film and bulk on a more
microscopic scale, Hall measurements of bulk Fe₂V₀.₈W₀.₂Al were carried
out (the results are summarized in Extended Data Fig. 5). Notably,
ρ_xy(B) is negative and almost linear for the given field and temperature 
range, indicating that electrons are the dominant charge carriers. Using 
a standard analysis for the field-dependent Hall data yields a charge 
carrier density at 7=300 K of ny, = 5.18 x 107 cm™ and a mobility of 
Mou = 3-29 cm? V1 s+. At T= 380 K, the following values are found: 
Nou = 7.6 X 107 cm and fly, = -2.13 cm’ V' s+. The same analysis for 
the film sample with B > 0 and at T= 373 K gives Ng, = 1.95 x 107° cm? 
and [rim = -90 cm? V's; and at T= 469 K, Ng, = 3 x 107° cm? and 
Lgim = ~67 cm? Vs. These numbers indicate that the charge carrier 
density in the bulk material is about one order of magnitude larger than
in the film, but the mobility is substantially smaller.

Simple arguments enable us to exclude the Si substrate as being 
responsible for the considerable differences between the film and the 
bulk data (Extended Data Figs. 4, 5). Assuming two dominant sets of 
charge carriers, one for the film (A) and one for the substrate (B), the 
observed Hall coefficient R_H can be expressed as 

Taking into account the electrical resistivity of the Si wafer,
ρ_xx,B > 100 Ω cm, compared to the Heusler film resistivity (several
hundred µΩ cm), it follows that the second term in the numerator of
equation (5) tends towards zero, although the mobility of Si is two or
three orders of magnitude larger, and the second term in the denominator
is negligible compared to the first. Altogether, this comparison
implies that the Si substrate provides only a minor contribution to the
Hall data, and thus R_H as obtained from the measurement is dominated
by the Heusler film.

Stability and reliability of measurements and stability of the samples.
The accuracy of electronic and thermal transport measurements can suffer
from contact problems, and so a series of Seebeck and electrical
resistivity measurements was carried out several times, increasing the
temperature to above 400 °C and then cooling down to room temperature.
This procedure enables us to confirm the measurement reliability and
the chemical and thermal stability of the thin films.

The results of the Seebeck coefficient measurements from these 
experiments are plotted in Extended Data Fig. 6a. The overall behaviour 
of each individual run matches the results of the other runs; although 
there are some changes in absolute values, the variance remains small. 
These runs demonstrate that in the given temperature range the behav- 
iour of the material is stable, although the distinct bcc structure is stable 
only in conjunction with the Si substrate. In addition, the various runs 
show that there is no diffusion of the Heusler phase into the substrate
(or vice versa), which would affect electronic transport in the films.
Furthermore, these measurements reveal a very similar temperature
dependence of S(T) compared to the main results we report here.
We thus also demonstrate that thin films with similar quality can be 
repeatedly prepared in the set-up at TU Wien. 

Extended Data Fig. 6b displays the temperature-dependent electrical
resistivity of the same sample taken during the Seebeck effect
experiments. Again, the overall behaviour of ρ(T) remains almost
unchanged, although slightly larger variations than seen in S(T) are
present, which might indicate slight changes in the mechanical quality of
the sample. Extended Data Fig. 6c represents the power factor PF = S²/ρ
of the same sample as evaluated from the data of Extended Data Fig. 6a, b.
Similar to the data reported in the main text, the power factor is very
high and therefore will also give very high values of ZT.
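
The link between power factor and figure of merit is ZT = PF·T/λ = S²T/(ρλ). A minimal sketch follows; the function names are hypothetical, and the Seebeck coefficient, resistivity and temperature used in the example are placeholder values chosen only to exercise the formulas, not measured data from this work.

```python
def power_factor(seebeck, resistivity):
    """PF = S^2 / rho in W m^-1 K^-2 (S in V/K, rho in ohm m)."""
    return seebeck ** 2 / resistivity


def figure_of_merit(seebeck, resistivity, thermal_conductivity, temperature):
    """Dimensionless ZT = S^2 T / (rho * lambda)."""
    return power_factor(seebeck, resistivity) * temperature / thermal_conductivity


# Placeholder inputs (hypothetical): S = -200 microvolt/K, rho = 1e-5 ohm m, T = 300 K.
# The thermal conductivity of 2.70 W K^-1 m^-1 is the flash-method value quoted in Methods.
pf = power_factor(-200e-6, 1e-5)
zt = figure_of_merit(-200e-6, 1e-5, 2.70, 300.0)
print(f"PF = {pf:.2e} W m^-1 K^-2, ZT = {zt:.2f}")
```

Because S enters squared, the sign of the Seebeck coefficient does not matter for PF or ZT.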

Data presented in this section verify the chemical and thermal 
stability of the thin films based on Fe₂V₀.₈W₀.₂Al. Furthermore, the data 
demonstrate the repeatability of the preparation technique we use to 
deposit the thin film and the soundness of our data acquisition. 

Thin-film thermal conductivity. To derive the thermal conductivity of
thin-film Fe₂V₀.₈W₀.₂Al, the ultrafast laser flash method and a
picosecond thermoreflectance study were carried out at room temperature.
A typical result of the flash method is displayed in Fig. 3.

The least-squares fit in Fig. 3 is derived by adjusting two parameters,
the heat diffusion time τ_diffusion = d²/α across the specimen and the
cooling time constant for effusion to the Si substrate, τ_cooling, where
d is the thickness of the sample and α is the thermal diffusivity. The
heat diffusion time, from the rear face of the thermoelectric film to the
Al thin-film surface, was obtained from this fit as
τ_diffusion = 1.44 × 10⁻⁶ s, with a standard deviation of 1.09 × 10⁻⁷ s.
The cooling time constant is determined to be τ_cooling = 3.63 × 10⁻⁶ s
with a standard deviation of 0.36 × 10⁻⁶ s. Using the thermal diffusivity
of Al (α = 9.7 × 10⁻⁵ m² s⁻¹), the heat diffusion time is
τ_Al = d_Al²/α = 1.03 × 10⁻¹⁰ s for d_Al = 100 nm. This indicates that
τ_Al ≪ τ_specimen. Hence, the Al layer on top of the Heusler film does
not substantially affect the present analysis and
α_specimen ≈ α_Heusler = (8.49 ± 0.739) × 10⁻⁷ m² s⁻¹.
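
The time-scale argument can be checked with τ = d²/α. In the sketch below the Al diffusivity is taken as 9.7 × 10⁻⁵ m² s⁻¹ and the fitted film diffusion time as 1.44 × 10⁻⁶ s; both exponents are assumptions, chosen because they reproduce the quoted τ_Al and the quoted film diffusivity. The function name is hypothetical.

```python
def diffusion_time(thickness_m, diffusivity_m2_s):
    """Heat diffusion time tau = d^2 / alpha in seconds."""
    return thickness_m ** 2 / diffusivity_m2_s


tau_al = diffusion_time(100e-9, 9.7e-5)  # 100-nm Al top layer
tau_film = 1.44e-6                       # fitted film value; exponent assumed

print(f"tau_Al = {tau_al:.3e} s")
print(f"tau_Al / tau_film = {tau_al / tau_film:.1e}")
```

The Al layer equilibrates several orders of magnitude faster than the Heusler film in this estimate, which is why it is neglected in the analysis.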

The thermal conductivity is then

$$\lambda_{\mathrm{diff}} = \alpha D C \qquad (6)$$

where D is the density and C the specific heat capacity of the
system. Using the theoretical density of Fe₂V₀.₈W₀.₂Al,
D = 7,406 kg m⁻³, and the heat capacity (derived experimentally),
C = 430 J kg⁻¹ K⁻¹, gives λ_diff = 2.70 W K⁻¹ m⁻¹. This value is more
than 25% smaller than the figure derived for bulk Fe₂V₀.₈W₀.₂Al.
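
Equation (6) can be evaluated directly from the quoted numbers. The sketch below multiplies the film diffusivity by the density and heat capacity and carries the diffusivity uncertainty through linearly; the function name is hypothetical, and treating D and C as exact is a simplifying assumption.

```python
def conductivity_from_diffusivity(alpha, density, specific_heat):
    """lambda = alpha * D * C in W K^-1 m^-1 (SI inputs)."""
    return alpha * density * specific_heat


ALPHA = 8.49e-7        # m^2 s^-1, film thermal diffusivity
ALPHA_ERR = 0.739e-7   # m^2 s^-1, quoted uncertainty
DENSITY = 7406.0       # kg m^-3
SPECIFIC_HEAT = 430.0  # J kg^-1 K^-1

lam = conductivity_from_diffusivity(ALPHA, DENSITY, SPECIFIC_HEAT)
lam_err = conductivity_from_diffusivity(ALPHA_ERR, DENSITY, SPECIFIC_HEAT)
print(f"lambda = {lam:.2f} +/- {lam_err:.2f} W K^-1 m^-1")
```

The product reproduces the quoted conductivity, which also confirms that the heat capacity enters per unit mass.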

Thermal conductivity values of the order of 1.5-3 W K⁻¹ m⁻¹ have
been previously reported for Fe₂VAl deposited as a thin film on either
MgAl₂O₄ or MgO. In that work, the more than tenfold reduction in λ(T)
of thin-film Fe₂VAl compared to bulk Fe₂VAl was attributed to structural
disorder, including dislocations, atomic vacancies, stacking faults,
and other factors introduced during the preparation of the film. As
inferred from our X-ray results, there is not only disorder due to the V
or W substitutions, but additionally all atoms in the unit cell are
randomly distributed on the same (2a) bcc-type lattice site. As a
consequence, the heat-carrying phonons are expected to be maximally
scattered. Thus, the lattice thermal conductivity should be lower than in
the bulk material. Disorder on all lattice sites in bcc or fcc systems,
that is, random solid solutions of elements, is a guiding design concept
in high-entropy alloys, which have been shown to exhibit very low thermal
conductivities. Finally, we note that a substantial reduction of λ(T), by
1 to 2 orders of magnitude, was found in undoped Si films (thickness,
~3 µm) compared to bulk Si.

Extended Data Fig. 7a shows a normalized thermoreflectance signal
(front-heating-front-detection set-up) of Fe₂V₀.₈W₀.₂Al, enabling us to
deduce the effusivity ε. A 100-nm-thick Al surface layer is deposited by
magnetron sputtering. The signal has been normalized by subtracting
the background offset component. Just after pulse laser irradiation, the
surface temperature of the deposited Al film rapidly increases. Then,
the surface temperature gradually decreases owing to heat diffusion
in the cross-plane direction.

The entire signal, over one cycle from the heating pulse to the next 
pulse (50 ns), is shown in Extended Data Fig. 7b, c, fitted with a
previously reported analytical equation. This analytical method considers 
the thermal effect of the past pulses, which enables us to observe the 
single-pulse heating effect (shown by the red dotted lines in Extended 
Data Fig. 7b, c). 

The time-dependent thermoreflectance signal contains information 
about the Heusler film, the Al top layer and has a contribution from the 
boundary thermal resistance, W,. 

Measured cross-plane thermal conductivity values frequently turn out to
be erroneously low, owing to the contribution of the boundary thermal
resistance. In general, this resistance is considerably lower than
1 × 10⁻⁷ m² K W⁻¹. As a consequence, this contribution has to be taken
into account when analysing heat transfer across a number of thin films.
Following previous work, an analytical solution enables us to separate
the boundary thermal resistance between the Heusler film and the Al top
layer from the thermal effusivity of the thermoelectric film. A least-squares fit 
gives W,=1.6 x 10° m?K Wand €=3,100J m’s °K (refs. *), 

The effusivity ε is related to the thermal conductivity λ by

$$\varepsilon = \sqrt{D C \lambda} \qquad (7)$$

From equation (7), the thermal conductivity at room temperature is
λ_eff = 3.02 W K⁻¹ m⁻¹, in good agreement with the value derived
from the thermal diffusivity data. Here, the density of Fe₂V₀.₈W₀.₂Al,
D = 7,406 kg m⁻³, and the experimentally derived heat capacity,
C = 430 J kg⁻¹ K⁻¹, are used.
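
Inverting the effusivity relation, λ = ε²/(DC), gives the second estimate of the thermal conductivity. The sketch below uses the fitted effusivity of 3,100 J m⁻² s⁻⁰·⁵ K⁻¹ together with the density and heat capacity quoted above, and compares the result with the flash-method value of 2.70 W K⁻¹ m⁻¹; the function name is hypothetical.

```python
import math


def conductivity_from_effusivity(effusivity, density, specific_heat):
    """lambda = effusivity^2 / (D * C) in W K^-1 m^-1 (SI inputs)."""
    return effusivity ** 2 / (density * specific_heat)


EFFUSIVITY = 3100.0    # J m^-2 s^-0.5 K^-1, from the least-squares fit
DENSITY = 7406.0       # kg m^-3
SPECIFIC_HEAT = 430.0  # J kg^-1 K^-1

lam_eff = conductivity_from_effusivity(EFFUSIVITY, DENSITY, SPECIFIC_HEAT)
lam_flash = 2.70       # W K^-1 m^-1, value from the thermal diffusivity data

print(f"lambda (effusivity) = {lam_eff:.2f} W K^-1 m^-1")
print(f"relative difference to flash value = {(lam_eff - lam_flash) / lam_flash:.1%}")
```

The two routes rely on different measured quantities (diffusivity and effusivity), so their agreement serves as a cross-check on the thermal data.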

Extended Data Fig. 8 shows the signal for the TiN reference-standard
sample from the National Metrology Institute of Japan, measured
by the ultrafast laser flash system used in this study. A heat diffusion
time of 1.42 × 10⁻⁷ s was obtained by the same method applied to the
Fe₂V₀.₈W₀.₂Al thin film. This value agrees to within 1.6% with
the reference value τ_TiN = 1.397 × 10⁻⁷ s. Hence, the thermal
diffusivity measurements in this study agree with the metrological
standard to within 2%.


DFT results. Because experimental information on specific atomic
distributions is still lacking, we considered 45 configurations of atomic
occupations in the 80-atom supercell to obtain the one with the lowest
energy. We emphasize that it is impossible to consider all possible
configurations, owing to the random occupation of the four elemental
constituents (Fe, V, W and Al atoms).

To obtain the doping effects of W with respect to the 80-atom
lowest-energy bcc-type supercell with composition Fe₂V₀.₈W₀.₂Al (Extended
Data Fig. 9b), we adopted the band-unfolding technique to plot the
electronic band structure on the basis of the parent unit cell. In this
case, we selected bcc-type Fe₂VAl as the parent material. A bcc-type
unit cell allows only two atoms, one at the corner and the other at the
body centre. Because Fe₂VAl has three types of atoms, we built an
artificial tetragonal unit cell (shown in Extended Data Fig. 9a) to use in
this process. Note that this tetragonal unit cell can be viewed as two
bcc-type unit cells. We determined that this artificial tetragonal Fe₂VAl
is non-magnetic, and its electronic band structure is shown by the black
curves in Extended Data Fig. 10a, b. Notably, the electronic band
structure of this tetragonal Fe₂VAl shows linear band crossings, marked
by circles along different high-symmetry lines. The atomic masses of Fe,
V and Al are only moderately heavy, so only very slight spin-orbit
coupling effects are present. Although this artificial Fe₂VAl is
non-centrosymmetric, the linear band crossings result in the appearance
of so-called topological Dirac nodal lines; on the k_xy plane, that is,
on the k_z = 0 plane, perpendicular to the k_z direction of the Brillouin
zone of the tetragonal unit cell, as defined in Extended Data Fig. 9a,
there exist four Dirac nodal lines.

After doping bcc-type Fe2VAl with W, the lowest-energy 80-atom
supercell was determined from our current first-principles calculations.
The effects of W doping are twofold: local magnetic ordering of some
Fe atoms occurs when they have nearest-neighbour W atoms, and
the total electronic valence number increases; it is higher in W than in
V, Fe or Al. This increase leads to the upshifting of the Fermi energy.
This is well reflected by the electronic band structures of the 80-atom
supercell obtained by the band-unfolding technique: in Extended Data
Fig. 10a, b we see that around the Fermi level the electronic pockets
shift to lower energies for both spin channels as compared to the band
structure of the parent material (black curves), indicating the effect of
W doping. The occurrence of magnetism from W doping induces the
most important effect: the appearance of Weyl nodes around the Fermi
level. The appearance of these nodes indicates that spin ordering breaks
time-reversal symmetry, which splits the Dirac nodal lines (as seen in
the parent Fe2VAl) into Weyl nodes. In addition, W doping displaces
the Weyl nodes closer to the Fermi level, as shown in Extended Data
Fig. 10a, b.

From Extended Data Fig. 10a, b we see that the DFT-derived unfolded
electronic band structures for the 80-atom bcc supercell of Fe2V0.8W0.2Al
demonstrate the coexistence of the expected multi-valley electron
and hole states. Interestingly, there exists a hole pocket with a degeneracy
of eight at 0.23 eV above the Fermi level along R-A in the Brillouin
zone. In addition, there exist two electronic valleys observed at 0.02 eV
and 0.07 eV above the Fermi level, along Z-R and Γ-R, respectively. In
particular, the electronic valley along Z-R has a valley degeneracy of
eight, whereas the electronic valley along the Γ-R path exhibits a valley
degeneracy of 16. To visualize the valley pockets along Z-R, R-A and
Γ-R, we have compiled their Fermi surfaces at energies of
0.11 eV, 0.11 eV and 0.13 eV above the Fermi level, respectively
(Extended Data Fig. 10c). These distinct valence and conduction band
valleys deviate from the high-symmetry points, leading to high valley
degeneracies; thus we expect the thermoelectric performance to be
enhanced. The carrier mobilities of every type of hole and electronic
valley were evaluated from Extended Data Fig. 10a, b. Importantly, at the
A point the hole band exhibits an extremely large theoretical mobility
(1.051 m² V⁻¹ s⁻¹ at 300 K) and the other four electronic valleys also have
very high electronic mobilities (X, 0.086 m² V⁻¹ s⁻¹; M, 0.545 m² V⁻¹ s⁻¹;
G, 0.041 m² V⁻¹ s⁻¹; R, 0.008 m² V⁻¹ s⁻¹, all at 300 K). The combination of
high degeneracies and high mobilities (as obtained from our DFT-derived
band structures) and the multi-valley band structure potentially
contribute to a substantial thermoelectric enhancement for bcc-type
Fe2V0.8W0.2Al, in line with recent reports on multi-valley thermoelectric
systems. From an electronic structure point of view, there is only a
small fraction of holes contributing to the transport properties, and so
most contributions are expected to originate from electronic valleys
at high temperatures.

On the basis of the Goldsmid formula, the gap in the DOS at the
Fermi energy can be estimated as roughly 260 meV, which is in the
range calculated by DFT. In addition, the calculations reveal that
the total density of states at the Fermi level is N(E_F) = 0.62 states per
eV per atom within the 80-atom supercell. This would correspond to a
Sommerfeld value of the specific heat of γ_DFT = 1.46 mJ mol⁻¹ K⁻², in good
agreement with experimental data derived for Fe2VAl (γ_exp = 1.5 mJ mol⁻¹ K⁻²).
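
The link between the density of states and the Sommerfeld coefficient can be sketched numerically. The block below assumes the standard free-electron relation γ = (π²/3) k_B² N(E_F), with N(E_F) taken per atom and γ expressed per mole of atoms; this normalization and the function name are assumptions for illustration, not details stated in the text.

```python
import math

# CODATA constants
K_B_J_PER_K = 1.380649e-23      # Boltzmann constant in J/K
K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K
N_AVOGADRO = 6.02214076e23      # particles per mole


def sommerfeld_gamma_mj(dos_per_ev_per_atom: float) -> float:
    """Sommerfeld coefficient in mJ mol^-1 K^-2 (per mole of atoms).

    Uses gamma = (pi^2 / 3) * k_B^2 * N(E_F). One factor of k_B is taken
    in eV/K so that it cancels the 1/eV of the density of states.
    """
    gamma_j = (math.pi ** 2 / 3.0) * K_B_J_PER_K * K_B_EV_PER_K \
        * dos_per_ev_per_atom * N_AVOGADRO
    return gamma_j * 1e3  # J -> mJ


gamma_dft = sommerfeld_gamma_mj(0.62)
print(f"gamma = {gamma_dft:.2f} mJ mol^-1 K^-2")
```

Under these assumptions, a density of states of 0.62 states per eV per atom gives a value close to the 1.46 mJ mol⁻¹ K⁻² quoted above.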

It is unlikely that the substantial enhancement of the Seebeck effect, and
thus of the thermoelectric performance, in the Heusler film studied here
results from the phonon drag term recently discovered in thin-film Bi2Te3.
That work demonstrated that the substrate can play an
important role, guiding the phonon drag mechanism in the active film.
However, the most prominent enhancement in Bi2Te3 is observed well
below the corresponding Debye temperature θ_D (≈ θ_D/10). The thermopower
maximum of the Fe2V0.8W0.2Al thin film at around 100 °C is thus
unlikely to be caused by the phonon drag mechanism. Another recently
noted mechanism to improve thermoelectricity is superparamagnet- 
ism” of nanosized particles dispersed in a host material. Fe,Si, which 
constitutes the interface layer between the Si substrate and the Heusler 
film, is a magnetically ordered compound with a Curie temperature of
the order of 520 K°°. However, because Fe,Si does not forma continu- 
ous layer (it instead forms a string-like arrangement of weakly coupled 
nanosized islands), superparamagnetism might be present in this interface.
Our electron microscopy study (see Fig. 2d, e), on the other hand,
demonstrates that the Fe,Si particles are localized and do not diffuse 
into the Heusler film. Consequently, the superparamagnetism enhancement
mechanism is presumably absent in the Fe2V0.8W0.2Al thin film.
Extended Data Fig. 1 | Electron microscopy results at the interface between substrate and film. a, ELNES of the oxygen K-edge of the thin alumina interlayer at
532-eV energy loss and the vanadium L-edge at 513-eV energy loss of the Heusler alloy. b, ELNES of the iron L-edge in Fe,Si and the Heusler alloy. 
Extended Data Fig. 2 | Schematic of the measurement set-up. The thin film, the interface and the Si substrate are shown. ΔT is the temperature
difference between the electrodes, denoted as U⁺ and U⁻.



Extended Data Fig. 3 | Electronic and thermoelectric transport of the composite (Heusler film, interface and Si substrate) and of the isolated Si substrate. 
a, Temperature-dependent Seebeck coefficient S. b, Temperature-dependent electrical resistivity ρ.
thin-film Fe2V0.8W0.2Al. a, Field-dependent Hall resistivity ρ_xy at symbols indicate the data for the majority and minority electrons, respectively;
in Methods. b, Temperature-dependent charge carrier densities n and mobility, respectively. The dashed lines are guides to the eye.
Extended Data Fig. 5 | Field-dependent Hall resistivity ρ_xy of bulk Fe2V0.8W0.2Al at T = 380 K and T = 300 K. The solid lines are least-squares fits, as explained
in Methods.
Extended Data Fig. 6 | Temperature-dependent transport and
thermoelectric properties of thin-film Fe2V0.8W0.2Al. Data are taken over four
picosecond thermoreflectance method. a, The signal data are enlarged around the instant of pump laser irradiation at t = 0 with the fitting curve calculated
as described in the cited references. The red dotted lines in b and c are corrected signals for single-pulse heating. The blue solid lines are least-squares fits as explained in Methods.
Extended Data Fig. 8 | Thermoreflectance signal of the 680-nm-thick TiN reference film deposited on a quartz substrate. The signal was obtained using the
ultrafast laser flash method. 
Extended Data Fig. 9 | The lattice structure of full-Heusler compounds. a, Parent material Fe2VAl. b, Parent material Fe2V0.8W0.2Al. Descriptions and
interpretations are summarized in Methods.

Extended Data Fig. 10 | Electronic structure of bcc Fe2VAl and Fe2V0.8W0.2Al.
a, b, Electron dispersion along the high-symmetry directions of the bcc
structure around the Fermi level for spin-up (a) and spin-down bands (b). The
dispersion of Fe2VAl is illustrated by black solid lines; the coloured lines refer to
Fe2V0.8W0.2Al. c-e, Fermi surfaces of bcc Fe2V0.8W0.2Al at 0.11 eV above the Fermi
level on Z-R (c); on R-A (d); and at 0.13 eV above the Fermi level on Γ-R (e).
Descriptions and interpretations are summarized in Methods.

Additive manufacturing, often known as three-dimensional (3D) printing, is a process
in which a part is built layer-by-layer and is a promising approach for creating
components close to their final (net) shape. This process is challenging the 
dominance of conventional manufacturing processes for products with high 
complexity and low material waste’. Titanium alloys made by additive manufacturing 
have been used in applications in various industries. However, the intrinsic high 
cooling rates and high thermal gradient of the fusion-based metal additive 
manufacturing process often lead to a very fine microstructure and a tendency
towards almost exclusively columnar grains, particularly in titanium-based alloys’. 
(Columnar grains in additively manufactured titanium components can result in 
anisotropic mechanical properties and are therefore undesirable*.) Attempts to 
optimize the processing parameters of additive manufacturing have shown that it is 
difficult to alter the conditions to promote equiaxed growth of titanium grains’. In 
contrast with other common engineering alloys such as aluminium, there is no 
commercial grain refiner for titanium that is able to effectively refine the 
microstructure. To address this challenge, here we report on the development of 
titanium-copper alloys that have a high constitutional supercooling capacity as a
result of partitioning of the alloying element during solidification, which can override 
the negative effect of a high thermal gradient in the laser-melted region during 
additive manufacturing. Without any special process control or additional treatment, 
our as-printed titanium—copper alloy specimens have a fully equiaxed fine-grained 
microstructure. They also display promising mechanical properties, such as high 
yield strength and uniform elongation, compared to conventional alloys under similar 
processing conditions, owing to the formation of an ultrafine eutectoid 
microstructure that appears as a result of exploiting the high cooling rates and 
multiple thermal cycles of the manufacturing process. We anticipate that this 
approach will be applicable to other eutectoid-forming alloy systems, and that it will 
have applications in the aerospace and biomedical industries. 


According to Interdependence Theory, the key factors controlling
grain size include: (1) ΔT_n, the critical undercooling for nucleation;
(2) ΔT_cs, the amount of constitutional supercooling in front of
the growing solid that provides the nucleation undercooling; and
(3) x_sd, the average spacing between the potent nucleation particles.
A small ΔT_n, large ΔT_cs and small x_sd favour grain refinement. The
rate of development of a constitutional supercooling zone is controlled
by the growth restriction factor Q. Larger values of Q promote
more nucleation. However, in additively manufactured metals, the
dimensions of the laser-melted region, coupled with a high thermal
gradient, considerably suppress the extent of the constitutional
supercooling zone, making it challenging to achieve a fine grain size
in additively manufactured titanium alloys. Multiple research groups
have explored the possibilities of adding solute elements such as
beryllium, silicon or boron to stop epitaxial growth. However, these
solute elements only decrease the width of columnar grains of the
additively manufactured titanium or only achieve a partial
columnar-to-equiaxed transition. It hence remains an open question whether
fully equiaxed grain structures in additively manufactured titanium
alloys are practically achievable through conventional grain-refining
paradigms.

It should be noted that in previous grain-refining studies, the normalized
Q value, m(k − 1), where m is the slope of the liquidus line
and k is the solute partition coefficient, has frequently been used


Fig. 1 | Additive manufacturing of Ti-6Al-4V and Ti-8.5Cu alloys. a, Optical
micrograph of an as-printed Ti-6Al-4V alloy showing coarse columnar grains.
b, By contrast, optical microstructures of an as-printed Ti-8.5Cu alloy show
fine, fully equiaxed grains along the building direction under the same
manufacturing conditions. The yellow arrows in b indicate successive layer
boundaries approximately every 200 µm and the average prior-β grain size is
9.6 µm, measured by the linear intercept technique. Inset, an enlarged portion
of a local region with ultrafine grains. c, Schematic diagram of the grain growth
mechanism of Ti-8.5Cu and Ti-6Al-4V alloys. T_A is the profile of the
temperature of the melt and T_E is the profile of the equilibrium liquidus
temperature. The values of ΔT_cs (= T_E − T_A) and ΔT_n are represented qualitatively


to guide the choice of solute elements. However, the solubility of
a given solute element in the β-phase titanium, which defines the
practical maximum solute concentration, c_0,max, has been neglected.
By simply exploring binary titanium alloy phase diagrams, we note
copper to be a promising solute, with a c_0,max as high as 17 wt% and a
reasonably high m(k − 1) value of 6.5 K. This leads to an overall very
high maximum Q value, Q_max = c_0,max · m(k − 1) = 110.5 K, which far
surpasses that of silicon or boron.
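
The maximum growth restriction factor quoted above is a simple product, sketched below. The function name is illustrative, and the linear form Q = c_0 · m(k − 1) is the dilute-solution estimate used in the sentence above; the Q values reported later for specific alloys are stated in Methods to come from Scheil cooling curves, so this product need not reproduce them exactly.

```python
def growth_restriction_factor(c0_wt_pct: float, m_k_minus_1: float) -> float:
    """Linear estimate Q = c0 * m(k - 1), in kelvin.

    c0_wt_pct   : solute concentration in wt%
    m_k_minus_1 : normalized Q value m(k - 1), in K per wt%
    """
    return c0_wt_pct * m_k_minus_1


# Values quoted in the text for copper in beta-phase titanium.
C0_MAX_CU = 17.0       # practical maximum solute concentration, wt%
M_K_MINUS_1_CU = 6.5   # normalized Q value, K per wt%

q_max = growth_restriction_factor(C0_MAX_CU, M_K_MINUS_1_CU)
print(f"Q_max = {q_max:.1f} K")
```

The design point is that a solute with a moderate m(k − 1) can still reach a very large Q if its solubility limit is high.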


Fig. 2 | Scanning electron microscopy (SEM) characterization of Ti-8.5Cu 
alloy. a,b, Backscattered electron (BSE) images of the as-printed Ti-8.5Cu alloy 
from Fig. 1b showing the microstructure evolution at the first layer (indicated 
by the red spots) during the additive manufacturing process, with constant 
processing parameters. The martensite phase forms when only a single layer 
was deposited (a); fine eutectoid lamellae surrounded by hyper-eutectoid 
by the length bars. The red dot is the centre of the previous grain, which has
grown to the size of the circle. The grey shapes represent the grain morphology
for the two alloys. d, Summary of the area percentage of equiaxed grains versus
grain size for the as-printed titanium alloys. Ti-Si alloys (blue triangles) are,
from top to bottom, Ti-0.04Si, Ti-0.19Si and Ti-0.75Si. The other titanium
alloys (dark red diamonds) are, from left to right, Ti-6Al-2Zr-2Sn-3Mo-1Cr-2Nb,
Ti-6.5Al-3.5Mo-1.5Zr-0.3Si and Ti-3Al-10V-2Fe. Most as-printed
titanium alloys have either fully columnar or mixed columnar and equiaxed
prior-β grains and the grain sizes are in the range of 100 µm to 1 mm. This work
shows that fully equiaxed prior-β grains can be achieved throughout the
as-printed samples. Error bars represent one standard deviation.


In addition to its potential for refining β-phase titanium grains, copper
is also a typical eutectoid-forming element in titanium binary alloy
systems, where β → α + Ti2Cu at 792 °C. Because copper diffuses rapidly
in titanium, this eutectoid reaction cannot easily be prevented from
occurring even after water quenching. Such characteristics are well
suited to the high cooling rates during additive manufacturing and
are likely to produce a very fine eutectoid microstructure, improving
both the strength and ductility of as-printed specimens. Therefore,


Ti2Cu particles form when multiple layers were deposited (b). c, A schematic
continuous cooling transformation diagram illustrates different solid-solid
phase transformation pathways for laser deposition of the first layer and the
successive layers. Heat accumulates during the deposition of successive layers,
thus the cooling rate is reduced and the β → α + Ti2Cu reaction is complete
before the martensite transformation temperature (M_s) is reached.
Fig. 3 | Transmission electron microscopy characterization of as-printed
Ti-8.5Cu alloy. a, Bright-field image showing the ultrafine eutectoid lamellar
structure and a small portion of hyper-eutectoid Ti2Cu particles close to the
prior-β grain boundaries. b-d, X-ray energy dispersive spectroscopy (XEDS)
mapping on a section of the eutectoid lamellar structure: high-angle annular
dark-field scanning transmission electron microscopy image (b),
titanium elemental map (c) and copper elemental map (d). XEDS point analyses
show that the copper contents in the lamellar structure are 2.8 wt% in α-phase
titanium and 39.1 wt% in Ti2Cu. Under equilibrium conditions, the maximum
solubility of copper in α-phase titanium is 2.0 wt% and in Ti2Cu it is 39.9 wt%.


in the present study, we aim to develop additively manufactured
titanium-copper alloys (Extended Data Fig. 1) to form fully equiaxed
β-phase titanium grains and an ultrafine eutectoid microstructure in a
one-step process.

The optical micrographs of the as-printed (see Methods) Ti-8.5Cu
specimen (herein, we use weight per cent unless otherwise specified)
show fully equiaxed prior-β grains (primary Ti grains that form during
solidification, as shown in Fig. 1b) without any noticeable cracks and
with a small volume fraction of enclosed porosity (see Extended Data
Fig. 2). The as-printed specimen also has excellent chemical homogeneity
along the building direction (see Extended Data Fig. 3). The
prior-β grains have a bimodal distribution with an average grain size
of 9.6 µm. In comparison, the microstructure of as-printed Ti-6Al-4V
alloy is dominated by coarse columnar grains (Fig. 1a) under the same
laser processing conditions. It can be seen that the addition of copper
has not only fully converted the columnar grains to equiaxed grains
but also refined the prior-β grains by two orders of magnitude. The
commonly observed epitaxial growth is also completely eliminated, as
indicated by the size of the equiaxed grains, which is much smaller
than the layer thickness of about 200 µm (yellow arrows in Fig. 1b).
It is also worth noting that compared with other additively manufactured
titanium alloys reported thus far, our current work has
produced the smallest equiaxed prior-β titanium alloy grains made
by additive manufacturing, as shown in Fig. 1d. The grain-refining
efficiency of the as-printed titanium-copper alloys stems from the
high capacity of the copper solute to establish a sufficiently large
constitutional supercooling zone in front of the solid-liquid interface,
which is formed when the solute copper segregates around the first
β-phase titanium dendritic grain (Fig. 1c); the Q value of the Ti-8.5Cu
alloy is 62 K. By contrast, in Ti-6Al-4V, the Al and V solutes provide
negligible constitutional supercooling (that is, Q = 8 K), which is far
less than the nucleation undercooling ΔT_n during solidification. As a
result, wide columnar grains with an average width of 120 µm grow in
the Ti-6Al-4V alloy, but fine equiaxed grains of average dimension
9.6 µm grow in the Ti-8.5Cu alloy. The constitutional supercooling,
ΔT_cs, is proportional to the Q value through the dimensionless
supersaturation parameter, Ω:


ΔT_cs = QΩ    (1)


This means that the constitutional supercooling zone is about eight times
greater in magnitude during additive manufacturing of Ti-8.5Cu
compared to Ti-6Al-4V, subjected to the same laser processing
conditions. Sufficient constitutional supercooling can efficiently
offset the negative impact of a high thermal gradient and ensures
that waves of heterogeneous nucleation events can be triggered in
the constitutional supercooling zone and a complete columnar-to-equiaxed
transition can be achieved. By Interdependence Theory,
the grain size is also dependent on Q. More copper solute delivers
higher constitutional supercooling faster, and therefore the size of
the equiaxed prior-β grain is reduced with increasing copper content
(see Extended Data Fig. 4).
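
The factor of about eight follows directly from equation (1) when both alloys are compared at the same supersaturation. The sketch below uses the Q values quoted in the text (62 K and 8 K); the function name and the choice of an arbitrary common supersaturation are assumptions made for illustration.

```python
def constitutional_supercooling(q_kelvin: float, omega: float) -> float:
    """Equation (1): constitutional supercooling = Q * Omega, in kelvin."""
    return q_kelvin * omega


Q_TI_8_5CU = 62.0  # K, quoted for Ti-8.5Cu
Q_TI_6AL_4V = 8.0  # K, quoted for Ti-6Al-4V

# Any common supersaturation cancels in the ratio; 0.1 is arbitrary.
omega_common = 0.1
ratio = (constitutional_supercooling(Q_TI_8_5CU, omega_common)
         / constitutional_supercooling(Q_TI_6AL_4V, omega_common))
print(f"supercooling ratio Ti-8.5Cu / Ti-6Al-4V = {ratio:.2f}")
```

The ratio is 7.75, which the text rounds to eight.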

It is worth mentioning that the Scheil-Gulliver solidification path
and freezing range are often used to predict the likelihood of crack- 
ing during solidification”. A large freezing range usually leads to less 
liquid being available for interdendritic feeding during the last stage 
of solidification. In this study, Scheil curves show a large freezing 
range of more than 500 K (Extended Data Fig. 5, dashed line) based 
on the titanium-copper equilibrium phase diagram. However, no 
cracks in the as-printed titanium-copper specimens were observed.
This can be at least partially explained by equation (1). As the required
constitutional supercooling critical temperature for the columnar- 
to-equiaxed transition is usually between 10K to 10K, the resultant 
supersaturation Ω is much less than 1. This means that heterogeneous
nucleation events occur very early during solidification. The
formation of fine equiaxed dendrites can effectively decrease the 
hot-tearing susceptibility, as validated in previous studies of cast 
alloys’. 

Upon completion of liquid-to-β-phase solidification, the β-phase
of titanium (a body-centred cubic structure) can decompose into
different product phases in the subsequent solid-solid phase
transformations, subject to the cooling rate. A high cooling rate can restrict
the diffusion of atoms, which suppresses eutectoid coupled growth,
resulting in martensite (α′-phase titanium, hexagonal close-packed
structure) formation. Martensite in titanium alloys can lead to higher
strength but lower ductility. As expected, acicular plates of martensite
(Fig. 2a) were observed as a result of the high cooling rate in the single
track of the additively manufactured Ti-8.5Cu alloy; however, successive
layer-by-layer fabrication leads to multiple thermal cycles
above and below the eutectoid reaction temperature (792 °C) in the
previously deposited layer and thus the cooling rate of the β-phase
decomposition decreases as the number of layers increases, owing
to insufficient heat dissipation (see Fig. 2c). This characteristic thermal
history can efficiently reverse the martensitic transformation
and results in ultrafine eutectoid lamellae (Fig. 2b and Extended Data
Fig. 6). Similar phenomena have been observed in other compositions
as well (see Extended Data Fig. 7). Moreover, the average interlamellar
spacing in the as-printed Ti-8.5Cu alloy is 46 nm ± 7 nm (Fig. 2b), which
is much finer than in conventionally manufactured water-quenched
(about 150 nm) and furnace-cooled (about 1 µm) samples. This is
because the interlamellar spacing is controlled by the diffusion length
of the copper atoms; the diffusion length is considerably restricted
by fast cooling.

Titanium alloys, in general, have a very low thermal conductivity,
<16 W m⁻¹ K⁻¹, which may lead to interlamellar spacing coarsening
from the surface to the core, owing to the variation in cooling rate
during a conventional normalizing heat treatment for large, bulky
Fig. 4 | Mechanical properties of as-printed Ti-Cu alloys. a, Representative
engineering stress-strain curves of the as-printed materials in this study; error
bars represent one standard deviation. b, Yield strength (0.2% offset) versus
tensile elongation to failure for Ti-Cu alloys manufactured by different
methods; the properties of these alloys are comparable with those of the


titanium-copper components. By contrast, the laser metal deposition
process enables relatively constant cooling rates across the alloy,
leading to a more uniform microstructure regardless of the size of
the specimen. Only a slight increase in interlamellar spacing from
the bottom (41 nm ± 5 nm) to the top (54 nm ± 9 nm) of the specimen
was observed (errors represent one standard deviation). This is
probably a result of the decreased cooling rate along the building
direction. It is also worth mentioning that the copper concentration
in the eutectoid lamellae (Fig. 3b-d) deviates from the equilibrium
composition. The α-phase titanium contains 2.8 wt% copper and is
supersaturated, because the maximum solid solubility of copper
in α-phase titanium is 2.0 wt% at equilibrium. This indicates that a
more substantial precipitation hardening effect could be achieved
to further increase the tensile strength through optimized post heat
treatment.

Tensile tests with subsized ASTM standard specimens were performed
on the as-printed alloys and the associated 0.2% offset yield
strength (σ_y), ultimate tensile strength (UTS), and uniform elongation (ε)
are summarized in Table 1. Comparing the Ti-6.5Cu and Ti-3.5Cu
alloys, the eutectoid lamellae in Ti-6.5Cu increase the strength
substantially but decrease the ductility (see Fig. 4a). Comparing
the Ti-8.5Cu and Ti-6.5Cu alloys, Ti-8.5Cu has higher strength
because of the higher volume fraction of eutectoid lamellae,
but lower ductility owing to the hyper-eutectoid Ti2Cu particles


Table 1 | Mechanical properties of as-printed Ti-Cu alloys

Samples     σ_y (MPa)    UTS (MPa)    ε (%)
Ti-3.5Cu    747 ± 7      867 ± 8      14.9 ± 1.9
Ti-6.5Cu    964 ± 31     1073 ± 27    5.5 ± 0.4
Ti-8.5Cu    1023 ± 29    1180 ± 21    2.1 ± 0.6
ASTM standard for a Ti-6Al-4V alloy. c, Ductile fracture surface of Ti-3.5Cu
showing small dimples. d, Fracture surface of Ti-6.5Cu showing a mixture of
regions of small dimples with regions of cleavage facets. e, Brittle fracture
surface of Ti-8.5Cu showing only cleavage facets.


(see Extended Data Table 1). The size of equiaxed prior-β grains
(Fig. 1b and Extended Data Fig. 4) and microstructural length scales
(Fig. 2b and Extended Data Fig. 7a, b) will probably also have
an impact on the mechanical properties. The fracture surfaces
(Fig. 4c-e) show changes from dimples to a typical intragranular
fracture morphology, which is consistent with the change in the
ductility of the alloys. Compared with conventional casting and
post-heat-treatment methods (Fig. 4b), the as-printed titanium-copper
alloys with ultrafine equiaxed prior-β grains and eutectoid lamellar
structure display a superior combination
of offset yield strength and ductility. The properties are also
comparable to those of cast and wrought Ti-6Al-4V alloy, as well as
laser-metal-deposited Ti-6Al-4V alloy. Furthermore, copper is a
relatively low-cost alloying element and titanium-copper alloys can
be additively manufactured with mixed elementary powders instead
of with pre-alloyed powders. Titanium-copper alloys also have excellent
antibacterial properties, good biocompatibility and corrosion
resistance. It is also anticipated that further improvements in a
range of properties can be achieved through process manipulation
using additive manufacturing.

We have demonstrated a pathway to additively manufacturing
titanium-copper alloys with both fine equiaxed prior-β grains and an
ultrafine eutectoid lamellar structure. Our experimental results show
that the solidification and subsequent eutectoid decomposition can
be synergistically engineered to tailor mechanical properties to suit
specific applications. This approach to grain refinement, using alloys
with high Q values, has been demonstrated across many alloying systems
and solidification processes and has been demonstrated here as
a design methodology for additively manufactured titanium alloys. The
methodology is also likely to be applicable to other eutectoid systems
such as pearlitic steels, in which the mechanical properties of these
conventional alloys could be enhanced by additive manufacturing for
high-performance engineering applications.
Methods


Laser metal deposition 

Pure (99.9%) titanium and (99.5%) copper spherical powders (TLS Technik
and Thermo Fisher, respectively) with diameters between 50 µm
and 100 µm (see Extended Data Fig. 8) were blended in a Turbula shaker
mixer for an hour to achieve the designed compositions. Laser metal
deposition was performed on a Trumpf TruLaser cell 7020. Before
manufacturing bulk samples, we used single-layer deposition to optimize
the processing parameters on the basis of visual observations of
the weld bead. The optimized laser metal deposition parameters for the
studied alloys are: laser power, 800 W; scanning speed, 800 mm min⁻¹;
laser spot size, 1.5 mm; powder flow rate, 2.0 rpm (1.7 g min⁻¹); hatch
distance, 1.05 mm; shielding gas (argon) flow, 16 l min⁻¹. The processing
parameters were kept the same for all Ti-xCu alloys. Three cubes of
10 × 10 × 10 mm³ with different compositions (3.5 wt%, 6.5 wt% and
8.5 wt% copper) were built on a commercially pure titanium plate. The laser
scanning route for laser metal deposition was a raster pattern with an
increment of 90° between each layer and the delay time between two
subsequent layers was 20 s. For comparison, a Ti-6Al-4V specimen
was additively manufactured using the same parameters.
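
For readers comparing these settings with other deposition studies, the parameters can be combined into commonly used energy-input metrics. The sketch below is illustrative only: the linear and areal energy inputs are derived quantities that the text does not report, and the function names are hypothetical.

```python
# Deposition parameters quoted in the Methods text.
LASER_POWER_W = 800.0
SCAN_SPEED_MM_PER_MIN = 800.0
HATCH_DISTANCE_MM = 1.05


def linear_energy_input(power_w: float, speed_mm_per_min: float) -> float:
    """Energy delivered per unit track length, in J/mm (derived metric)."""
    speed_mm_per_s = speed_mm_per_min / 60.0
    return power_w / speed_mm_per_s


def areal_energy_input(power_w: float, speed_mm_per_min: float,
                       hatch_mm: float) -> float:
    """Energy delivered per unit scanned area, in J/mm^2 (derived metric)."""
    return linear_energy_input(power_w, speed_mm_per_min) / hatch_mm


linear_j_per_mm = linear_energy_input(LASER_POWER_W, SCAN_SPEED_MM_PER_MIN)
areal_j_per_mm2 = areal_energy_input(LASER_POWER_W, SCAN_SPEED_MM_PER_MIN,
                                     HATCH_DISTANCE_MM)
print(f"linear energy input = {linear_j_per_mm:.1f} J/mm")
print(f"areal energy input  = {areal_j_per_mm2:.1f} J/mm^2")
```

Because the same parameters were used for every Ti-xCu alloy and for Ti-6Al-4V, these metrics are identical across the specimens, which is what allows the microstructures to be compared directly.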

For the tensile samples, three cuboids (120 × 25 × 25 mm³) were horizontally
built and then machined into five tensile samples. The loading
direction of tensile samples is perpendicular to the laser metal deposi- 
tion building direction. 


Chemical compositions 

The chemical composition of the as-printed samples was determined by
inductively coupled plasma-atomic emission spectroscopy, as summarized
in Extended Data Table 1. A small amount of copper evaporation is expected,
as its boiling point is 2,560 °C, much lower than that of titanium (3,285 °C).


X-ray micro computed tomography 

The as-printed samples were scanned using X-ray micro computed 
tomography (GE Phoenix V tome) with a nominal resolution of 5 µm.
Defect analysis including 3D image reconstruction, relative density, 
dimension and percentage of the defects was performed with Volume 
Graphics software. The as-printed specimens are all fully dense (>99.4%) 
without any lack of fusion defects. The porosity found in the as-printed 
specimens may come from the existing porosity in the powders (see 
Extended Data Fig. 8) as well as the manufacturing process. 


X-ray diffraction 

Phase identification was determined by X-ray diffraction (XRD) using 
a Bruker AXS D4 Endeavour diffractometer over a 2θ range between
15° and 90° at a scanning rate of 0.06° s⁻¹.


Microscopy 

As-printed cube samples were cut along the central section parallel to 
the build direction. All the samples as well as raw powders were pre- 
pared by mounting, grinding and polishing. The samples for optical 
microscopy were etched by Kroll’s reagent to reveal the grain boundaries.
Light optical microscopy with a polarized lens was used for examination
of the microstructure. SEM in BSE mode was carried out using
a FEI Verios 460L. Fracture tomography was analysed using SEM in 
secondary electron mode. 


The average grain size was measured from five optical micrographs 
of each alloy using the linear intercept technique. Volume fraction of 
lamellae phase was calculated from three BSE microstructure images 
(5,000x magnification) by using the colour threshold. 
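
As an illustration of the linear intercept technique mentioned above, the mean grain size can be estimated as the total test-line length divided by the number of grain-boundary intercepts. The sketch below is a minimal example under that assumption; the function names and the sample counts are hypothetical and are not the measurements reported in this study.

```python
def mean_linear_intercept(total_line_length_um, intercept_count):
    """Mean linear intercept (um): total test-line length / number of intercepts."""
    if intercept_count <= 0:
        raise ValueError("intercept_count must be positive")
    return total_line_length_um / intercept_count


def average_grain_size(micrographs):
    """Pool (line_length_um, intercept_count) pairs from several micrographs."""
    total_length = sum(length for length, _ in micrographs)
    total_intercepts = sum(count for _, count in micrographs)
    return mean_linear_intercept(total_length, total_intercepts)


# Hypothetical counts from five micrographs (illustrative only).
example = [(1000.0, 60), (1000.0, 62), (1000.0, 58), (1000.0, 61), (1000.0, 59)]
print(round(average_grain_size(example), 1))
```

Pooling lengths and counts before dividing weights each micrograph by the line length actually measured on it, rather than averaging five separate ratios.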

For transmission electron microscopy (TEM) sample preparation, 
SEM with a focused ion beam (FEI Scios) was used to prepare site- 
specific TEM foils. Then, scanning transmission electron microscopy
and XEDS mapping were performed in an image-corrected Titan3 G2
60-300 (S)TEM equipped with FEI’s ChemiSTEM technology. 


Solidification simulation 

Equilibrium and Scheil-Gulliver solidification models were simulated 
using Pandat software with PanTitanium database (version 2018). The 
Q values for Ti-6Al-4V and Ti-8.5Cu were determined from Scheil
cooling curves.


Tensile testing 

As-printed samples were machined into rectangular tension test
specimens with gauge length of 25 mm and thickness of 4 mm (subsize
specimen of ASTM standard E8/E8M-08). The tensile test loading
direction is perpendicular to the laser metal deposition building direction.
Quasi-static uniaxial testing was carried out at room temperature
with an initial strain rate of 1.0 x 10s on a universal testing facility
(MTS810, 100 kN) equipped with a non-contact laser extensometer.
Five tensile specimens were tested for each composition (see Extended 
Data Fig. 9). The results were then compared with ASTM standards for 
standard-size specimens. 


Data availability 


The datasets generated or analysed during the current study are avail- 
able from the corresponding author on reasonable request. 

Extended Data Fig. 1 | Ti-Cu phase diagram. Portion of the Ti-Cu phase
diagram indicating the compositions selected for laser metal deposition. We
selected 3.5, 6.5 and 8.5 wt% copper to explore the behaviour of hypo-eutectoid,
eutectoid and hyper-eutectoid compositions under additive
manufacturing. This figure is adapted from ref.*°, with the permission of ASM
International.


Extended Data Fig. 2 | 3D visualization of the porosity of the manufactured specimens in the xyz coordinate system. a, Ti-3.5Cu. b, Ti-6.5Cu. c, Ti-8.5Cu.
d, Calculated relative density of the as-printed specimens. Error bars represent one standard deviation.

Extended Data Fig. 3 | XEDS results of the copper content along the building direction for Ti-8.5Cu alloy. The base point is 0 mm and the chemical composition
is homogeneous. Error bars represent one standard deviation.


Extended Data Fig. 4 | Polarized optical microstructures. a, b, The equiaxed grains of as-printed Ti-3.5Cu (a) and Ti-6.5Cu (b). The average grain size is 69.8 μm
for Ti-3.5Cu and 16.3 μm for Ti-6.5Cu.

Extended Data Fig. 5 | Solidification curves. The data are shown for different copper compositions under equilibrium and Scheil conditions. The Scheil curves
show a substantially enlarged temperature interval between liquidus and solidus temperatures compared with the equilibrium condition.

Extended Data Fig. 6 | XRD spectra. Experimental XRD spectra collected from the as-printed Ti-8.5Cu alloy indicate that only two phases are present in the
specimen: α-phase titanium and Ti₂Cu.

Extended Data Fig. 7 | BSE images. a-d, BSE images of as-printed specimens showing the fine α phases when multiple layers were deposited, for Ti-3.5Cu
(a) and Ti-6.5Cu (b); and the martensite phase when only a single layer was deposited for Ti-3.5Cu (c) and Ti-6.5Cu (d). Images were taken at the first layer
of build specimens, indicated by the red spots.

Extended Data Fig. 8 | SEM images of the cross-section of raw powders.
a, b, SEM images of the titanium powder (a) and copper powder (b) cross-sections.
The powders are spherical in shape with a diameter between 50 μm and 100 μm,
and porosity can be observed within some powder particles. The yellow arrows
indicate examples where powder particles fell out of the resin during the
polishing process.

Extended Data Fig. 9 | Engineering stress-strain curves. The data for the additively manufactured materials tested in this study indicate good repeatability.

Extended Data Table 1 | Measured chemical compositions (wt%) and volume fraction of eutectoid lamellae in the as-printed alloys

Alloy (nominal composition)   Cu     N      O      Ti     Eutectoid lamellae (%)
Ti-3.5Cu                      3.20   0.01   0.22   Bal.   0
Ti-6.5Cu                      6.33   0.02   0.23   Bal.   53 ± 7
Ti-8.5Cu                      8.36   0.02   0.21   Bal.   92 ± 4

Errors represent one standard deviation.

Seismological data provide evidence of a depth-dependent rupture behaviour of
earthquakes occurring at the megathrust fault of subduction zones, also known as
megathrust earthquakes. Relative to deeper events of similar magnitude, shallow
earthquake ruptures have larger slip and longer duration, radiate energy that is
depleted in high frequencies and have a larger discrepancy between their surface-wave
and moment magnitudes. These source properties make them prone to
generating devastating tsunamis without clear warning signs. The depth-dependent
rupture behaviour is usually attributed to variations in fault mechanics. Conceptual
models, however, have so far failed to identify the fundamental physical causes of the 
contrasting observations and do not provide a quantitative framework with which to 
predict and link them. Here we demonstrate that the observed differences do not 
require changes in fault mechanics. We use compressional-wave velocity models from 
worldwide subduction zones to show that their common underlying cause is a 
systematic depth variation of the rigidity at the lower part of the upper plate — the 
rock body overriding the megathrust fault, which deforms by dynamic stress transfer 
during co-seismic slip. Combining realistic elastic properties with accurate estimates 
of earthquake focal depth enables us to predict the amount of co-seismic slip (the 
fault motion at the instant of the earthquake), provides unambiguous estimations of 
magnitude and offers the potential for early tsunami warnings. 


Subduction megathrust earthquakes result from episodic, unstable
sliding within the seismogenic zone, a fault segment that is thought
to extend from about 40-50 km to about 5-10 km depth. Great earthquakes
initiating within the seismogenic zone can propagate updip
from this limit, as evidenced for the 2011 Tohoku-Oki event (moment
magnitude, Mw, of 9.1) and the 2010 Maule event (Mw 8.8), while events
forming a particular class known as ‘tsunami earthquakes’ appear to
rupture only the shallowest, allegedly non-seismogenic part of the
megathrust (Extended Data Fig. 1). The seemingly anomalous characteristics
of shallow ruptures suggest a depth dependency of the
rupture process, commonly attributed to changes in fault properties.
However, current conceptual models trying to explain the differences
are qualitative and case-dependent; they treat the different
rupture characteristics individually, as if they were caused by unrelated
factors, and do not pinpoint the primary physical causes. Slow
rupture propagation and large slip, for instance, are commonly
attributed to the presence of weak subducting sediment, whereas
pore-pressure-related weakening and a depth-dependent distribution
of initial stresses have also been proposed to explain large slip
and high-frequency depletion. None of these models has been used
to explain the remarkable discrepancy between Mw and surface-wave
magnitude, Ms, for shallow earthquakes.


We propose a conceptual change to this unsolved question. Our 
hypothesis is that changes in fault mechanics are not necessarily 
required to explain the observed depth-dependent trends of the rup- 
ture characteristics. Instead, we postulate that the trend mainly reflects 
depth variations of the elastic properties of the overriding plate at a 
larger scale. This hypothesis stands on the fact that downgoing oceanic 
slabs and overriding plates exhibit contrasting patterns of permanent 
deformation (Fig. 1). Overriding plates display widespread contractional
structures indicating a dominant sub-horizontal principal
compressional stress, whereas oceanic plates are dominated by extensional
faulting, implying a near 90° rotation of the orientation of the principal 
stresses across the megathrust. Sedimentary strata of underthrust- 
ing plates have sub-horizontal attitude, typically lack contractional 
deformation and are cut by normal faults, supporting the idea that the 
principal compressional stresses are sub-vertical immediately below 
the megathrust fault. Thus, the distribution of tectonic structures and 
the inferred orientation of principal stresses support the idea that the 
elastic energy released during megathrust earthquakes has accumu- 
lated in overriding plates (Fig. 1). Correspondingly, co-seismic defor- 
mation should affect overriding plates, with negligible effect on the 
underthrusting plates. Hence, the recorded tectonic history indicates 
that the elastic properties of the overriding plate need to be considered 
to understand the earthquake phenomena, given the constraints that
they impose on dynamic stress transfer during co-seismic slip.

Our hypothesis implies that differences in rupture behaviour should
be predictable and quantifiable if the depth distribution of elastic properties
is known, and this information could be used to improve tsunami
hazard assessment. To test it, we used 48 compressional-wave velocity
(Vp) models obtained with travel-time modelling of wide-angle reflection
and refraction seismic profiles across circum-Pacific and Indian
Ocean subduction zones (Extended Data Fig. 1 and Extended Data
Table 1). We averaged Vp at the lower part of the overriding plate as a
function of interplate boundary depth below seafloor, from the surface
to approximately 25 km depth (Fig. 2 and Methods). The travel-time-based
Vp models allow the rock volume encompassing the propagating
rupture front to be resolved (Extended Data Fig. 3 and Methods). The
global Vp(z) trends of accretionary and erosional margins, where z is
interplate boundary depth below the seafloor, display slight differences
at depths shallower than 5 km and gradually converge below this depth
(Fig. 2b and Extended Data Fig. 4a). Vp(z) variations probably reflect
differences in rock nature between the two margin types in the shallow
part, and a progressive rock compaction and a decrease in fracturing
at deeper levels, as suggested by seismic images (Fig. 1). On average,
Vp increases by a factor of 2.0-2.5, from about 3.0 km s⁻¹ at 1 km depth
to about 6.5 km s⁻¹ at 25 km depth (Fig. 2c), with gradient decreasing
downwards (Extended Data Fig. 5a).

Fig. 1 | Tectonic structure of the shallow region of two types of subduction
zones where tsunamis are generated. a-c, Depth-migrated multi-channel
seismic image of the Java Trench (a). The overriding plate is made of accreted
sediment thrust sheets, with different structure where the prism is more than
about 5 km thick (inset b), and in the front, where it is less than about 5 km thick
(inset c). Thrust at the front gradually rotated as material accumulated,
thickening the prism landward (c), but when thrusts are too steep to continue
slipping, out-of-sequence thrusts and folding further thicken and compact the
prism (b). d-f, Pre-stack depth migration of the Japan Trench dominated by
tectonic erosion. The igneous basement flexes accumulating elastic energy
and is cut by normal faults in its upper section (inset e; no vertical
exaggeration). The frontal about 25 km is a sediment prism less than about 5 km
thick, with thrust faulting (inset f; no vertical exaggeration). Both margins (a, d)
show contraction structures in the overriding plate indicating sub-horizontal
main compressional stress. However, under the megathrust fault, the
downgoing plate displays a fundamentally different structure: the top of the
oceanic igneous crust is traced from the incoming plate into the
underthrusting slab, overlaid by a layer of little-deformed sediment strata. The
oceanic plate is characterized by horst and graben associated with bend-faulting,
indicating a sub-vertical main stress. We interpret that the properties of the
rock body deforming during rupture propagation (in red in the images) should
change considerably within about the frontal 50 km of the margin. Stresses will
be transferred through relatively consolidated material at about 10 km depth
to progressively more fractured material at about 5 km depth, and a highly
disaggregated upper plate, in the thinnest frontal 15 km or so of the overriding
plate. cmp, common mid-point. P-7 and P-849 indicate the line profiles shown.

We then use Vp to derive rigidity (μ = ρVs², where ρ is density and Vs
is shear-wave velocity), which affects important aspects of earthquake
rupture. In the absence of direct Vs(z) measurements, we apply well-established
empirical relationships for ρ(Vp) and Vs(Vp) from experimental
data on multiple rock types (see Methods). The resulting
distributions of ρ(z) and Vs(z) are shown in Extended Data Fig. 3, and
that for μ(z) is shown in Fig. 2d. The trend shows a 4-5-fold increase in
μ between the surface and about 25 km depth, from <10 GPa at 1-2 km
depth to 40-45 GPa at 20-25 km depth, with gradient decreasing downwards
(Extended Data Fig. 5b). Based on the observed rates of variation,
we define three domains along the megathrust: shallow (0-5 km),
transitional (5-10 km) and regular (10-25 km) (Fig. 2d).

Fig. 2 | Convergent margin structure and P-wave velocity at the lower part of
overriding plates. a, Conceptual model with main geological features of
convergent margins, based on geophysical data of Central and South America.
Decrease in Vp and intensified faulting towards the trench is interpreted to
reflect increasing porosity and fracturing degree (see also Fig. 1). b, Coloured
circles show digitized Vp values of the lower part of the upper plate just above
the interplate boundary, as a function of interplate boundary depth below
seafloor (depth b.s.) (z). Red circles correspond to accretionary margins,
yellow circles to erosional margins. Location of profiles in the compilation is
shown in Extended Data Fig. 1. Additional information and references are
provided in Extended Data Table 1. Blue lines with arrowheads indicate
correspondence between different depth domains in a and Vp distribution (b).
c, Orange circles show average Vp as a function of z, obtained by averaging Vp(z)
values in b within a 1-km-thick sliding window. d, Blue circles show μ as a
function of z, obtained from ρ(z) and Vs(z) in Extended Data Fig. 4. Data point
values in c and d are fitted with a fourth-order polynomial regression (black
lines). Error bars represent one standard deviation.

The depth trend of elastic properties within the three domains
strongly conditions the predicted differences in rupture characteristics.
To show this, we compare predicted ratios of amount of slip,
rupture duration and corner frequency, as well as Mw − Ms, for earthquakes
of equal magnitude and equal stress drop, computed from
classical self-similar source theory taking as a reference the values
at 25 km depth (Fig. 3 and Methods). Our results show that, for all the
source properties considered, relative changes are concentrated in the
shallow domain. For a given earthquake magnitude and stress drop,
the predicted amount of slip is about 5-10 times larger in the shallow
domain than in the regular domain (Fig. 3a), whereas rupture duration
is 2-3 times longer (Fig. 3b), and corner frequency (fc) 1-2 octaves lower
(Fig. 3c). The fc lowering implies that shallow earthquakes should be
depleted in high frequencies. The high-frequency depletion produces
a depth-dependent discrepancy between Mw and Ms because these two
earthquake magnitudes are based on data at different frequencies
(Extended Data Fig. 6b). The predicted Mw − Ms difference for a Mw 7.5
event is 0.2-0.3 in the regular domain but increases to 0.6-0.8 in the
shallow domain because of the decrease in fc (Fig. 3d). Figure 4 presents
a conceptual model summarizing all these predictions.

The obtained values agree with average trends of rupture properties
of natural examples. One example is tsunami earthquakes, infrequent
but well-documented events that rupture only the shallowest megathrust
and generate anomalously large tsunamis for their magnitude
(Extended Data Fig. 1 and Extended Data Table 2). These events display
all the characteristics of shallow ruptures, including long duration,
high-frequency depletion inducing subdued seismic shaking, and
large Mw − Ms discrepancy. These characteristics, however, are
not unique to tsunami earthquakes. Great earthquakes rupturing from
deep in the seismogenic zone to close to the trench axis exhibit similar
rupture properties in their shallow-depth portion.
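
The size of the predicted shallow-versus-regular contrasts (Fig. 3) can be approximated from the depth trend of elastic properties alone. The sketch below assumes, for events of equal seismic moment and rupture area, that slip scales as 1/μ, that rupture duration scales as 1/Vs, and that corner frequency scales with Vs (as in Brune-type source models). These scalings, the function name and the rounded inputs (Vs of about 1.4 and 3.8 km s⁻¹, μ of about 4.4 and 40 GPa near 1 km and 25 km depth) are illustrative assumptions, not the values plotted in Fig. 3.

```python
import math


def relative_source_properties(mu, vs, mu_ref, vs_ref):
    """Ratios of slip, duration and corner frequency relative to a reference depth.

    mu, mu_ref: rigidity at the depth of interest and at the reference depth (same units).
    vs, vs_ref: shear-wave velocity at the same two depths (same units).
    """
    slip_ratio = mu_ref / mu                 # assumed: slip ~ 1/mu for equal moment and area
    duration_ratio = vs_ref / vs             # assumed: duration ~ 1/Vs
    corner_octaves = math.log2(vs / vs_ref)  # assumed: fc ~ Vs; negative means lower fc
    return slip_ratio, duration_ratio, corner_octaves


# Illustrative shallow (about 1 km) versus reference (25 km) values.
slip, duration, octaves = relative_source_properties(
    mu=4.4, vs=1.4, mu_ref=40.0, vs_ref=3.8
)
print(round(slip, 1), round(duration, 1), round(octaves, 1))
```

With these inputs the three ratios fall inside the ranges quoted in the text: slip 5-10 times larger, duration 2-3 times longer, and corner frequency 1-2 octaves lower.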

Studies of tsunami earthquakes based on seismological, geodetic
and tsunami modelling support the idea that slip was not only concentrated
in the shallow domain, but also increased upwards to peak near the
trench axis. Likewise, slip of some great tsunamigenic earthquakes
appears to have peaked close to the trench, most clearly for the Mw 9.1
Tohoku-Oki event in 2011, with maximum slip exceeding 50 m near
the trench axis, and for the Mw 8.8 Maule event in 2010. Current
understanding attributes large shallow slip to the frictional properties
of fault-rock materials, or to local features such as near-trench
slumps, and to subducting relief. However, the 4-5-fold decrease in
μ in the shallowest part of the megathrust (Fig. 2d) implies a 5-10-fold
increase in slip relative to regular earthquakes of the same size (Fig. 3a).
This trenchward slip increase is consistent with the large shallow slip
required to generate large tsunamis by either tsunami earthquakes
or great earthquakes rupturing to the trench. Specifically, a 5-fold
reduction of μ between regular and shallow domain depths was inferred
to fit tsunami wave amplitudes of the Ms 7.0-7.2 Nicaragua tsunami
earthquake in 1992 (Extended Data Fig. 1 and Extended Data Table 2).

Fig. 3 | Predicted earthquake rupture and energy release characteristics.
a, Blue circles show slip ratio (D_R) for an earthquake of a given magnitude as a
function of interplate boundary depth b.s. (z). D_R is calculated taking as
reference unit the slip at z = 25 km. b, Red circles show relative mode III rupture
duration (T_R) for an earthquake of a given magnitude, as a function of z. T_R is
calculated taking as reference the rupture duration per unit length at z = 25 km.
c, White circles show corner frequency ratio (f_R) for an earthquake of a given
magnitude as a function of z. f_R is calculated taking as reference unit the corner
frequency at z = 25 km. The dashed and dot-dashed black lines show
frequencies one and two octaves lower than the reference one, respectively.
d, Black lines show the difference between Mw and Ms as a function of z, for
earthquakes of Mw = 6.5, 7, 7.5 and 8. Orange, yellow and white rectangles in all
panels indicate the depth extension of the shallow, transitional and regular
domains referred to in the text. See details on the calculations in the main text.
Data point values in a-c are fitted with a fourth-order polynomial regression
(black lines). The error bars represent one standard deviation.

Fig. 4 | Conceptual model of megathrust seismogenic zone domains.
Schematic diagram shows main geological and tectonic features of the upper
and subducting plate based on geophysical data of Central and South America.
Dotted black lines in the upper plate indicate isovelocity contours. Overriding
plate faulting and fracturing increases towards the trench, thereby reducing Vp,
ρ, Vs and rigidity (μ). Deformation and fracturing are concentrated above
subducting sediment and interplate boundary. The diagram also shows
differences of earthquake rupture and energy release characteristics for
megathrust earthquakes occurring within the shallow (red) and regular (blue)
domains discussed in main text. Red (blue) ellipses are
rupture zones of the same size occurring within the shallow (regular) domains.
Depth-dependent changes in elastic properties quantitatively explain that,
compared with the regular-domain rupture, the shallow-domain rupture should
have propagation velocity up to 2-3 times slower, so 2-3 times longer duration;
5-10 times larger slip, so high tsunamigenic potential; 1-2 octaves lower corner
frequency, so high-frequency depletion and subdued seismic shaking; and a
3-4 times larger Mw − Ms discrepancy, of up to 0.6-0.8 for a Mw 7.5 earthquake
(Fig. 3). The model predicts that the shallowest ruptures, and especially those
reaching the trench axis, should show the largest differences with respect to
regular events in all the analysed attributes, as is observed in natural examples.

The slow rupture propagation and long duration compared with
deeper events of the same magnitude is a key characteristic of tsunami
earthquakes, to the point that they are often referred to as ‘slow
tsunami earthquakes’. Their average propagation velocity is about
1-2 km s⁻¹ (ref.*”), whereas for deeper events it is about 3 km s⁻¹, in agreement
with predicted propagation velocity differences between the
shallow and regular domains (Extended Data Fig. 4c). The predicted
increase of source duration also agrees with observations of normalized
duration for Mw 5.0-7.5 earthquakes in circum-Pacific subduction
zones (Extended Data Fig. 7a). These data show that the duration
of earthquakes shallower than about 10 km depth, which include six
tsunami earthquakes, is 2-3 times that of deeper events, as predicted
by our model (Fig. 3b and Extended Data Fig. 7b). Smaller-magnitude
events occurring within the rupture areas of the Mw 9.1 2004 Sumatra-Andaman
and the Mw 9.1 2011 Tohoku-Oki earthquakes also show
longer normalized duration in the near-trench zone.

Another characteristic of tsunami earthquakes is a high-frequency
deficit relative to regular events of equal magnitude. The resulting
ground shaking is weaker, and tsunami hazard based on human perception
is therefore underestimated. This was the case for the 1992
Nicaragua earthquake, where mild shaking caused little damage, and
the tsunami hit the coast unexpectedly. But this feature is not unique
to tsunami earthquakes. Seismological data of recent great tsunamigenic
earthquakes support a pattern of two distinct rupture modes
for the Mw 9.1 2004 Sumatra-Andaman, the Mw 9.1 2011 Tohoku-Oki,
the Mw 8.8 2010 Maule and the Mw 8.3 2015 Illapel events (Extended
Data Fig. 1 and Extended Data Table 2). Those earthquakes initiated
deep in the seismogenic zone with rupture radiating high-frequency
energy and producing strong shaking, followed by shallow rupture with
lower-frequency content that generated large seafloor deformation
and originated the tsunamis. The trend of higher-frequency content
in the regular domain (Fig. 3c and Extended Data Fig. 6a) due to the
spectral amplitude decay (Extended Data Fig. 6b) can also be explained
by the depth-dependent overriding rock properties, without calling
for a hypothetical depth-dependent stress drop trend that is barely
supported by seismological data.

Owing to high-frequency depletion, the initial magnitude estimation
of the 1992 Nicaragua earthquake was Ms 6.8 (later corrected to 7.0-7.2),
too low for a tsunami alert to be issued. However, the Mw 7.6-7.8 calculated
after more detailed data analysis, if available earlier, would have
prompted the alert. On average, tsunami earthquakes have Mw ≈ 7.6
and Mw − Ms = 0.65 (Extended Data Table 2), so that they have larger
Mw − Ms differences than regular earthquakes of the same magnitude.
These Mw − Ms discrepancies are difficult to explain for earthquakes of
magnitude Mw 7.6 rupturing the regular domain, where Mw − Ms should
be less than about 0.3 (Fig. 3d). However, the depth-dependent elastic
properties imply that the Mw − Ms discrepancy for earthquakes of this
magnitude can increase to 0.6-0.8 when rupture concentrates in the
shallow domain (Fig. 3d), in agreement with observations.

Although shallow megathrust earthquake ruptures are infrequent,
their slip distribution peaking near the trench makes them particularly
hazardous. Extended Data Fig. 8 shows that a Mw 7 earthquake rupturing
the regular domain has the same spectral amplitude at 20 s as a Mw 8
event rupturing the shallow domain, and thus the same Ms, if depth-dependent
changes of elastic properties are taken into account. The
associated tsunami hazard of these two events is radically different,
but it cannot be forecast based on Ms. Proper magnitude and tsunami
hazard evaluation require incorporating focal depth information and
the local Vp(z). Where local velocity models are not yet available, tsunami
forecast can be improved using the global trends obtained here (Fig. 2).

In summary, interplate fault mechanics may play a role in controlling
different aspects of the seismic cycle but do not seem to be required 
to explain the overall depth-dependent trend of the source properties 
considered here. We show quantitatively that the observed characteris- 
tics of both shallow and regular earthquakes reflect the elastic proper- 
ties of the rock volume undergoing dynamic stress transfer. However, 
our model uses average physical properties to explain global trends of 
source characteristics rather than individual examples. The observed 
variability of physical properties across different systems (Fig. 2b) 
implies that proper analysis of particular seismic events would require 
determination of the elastic properties throughout their specific rup- 
ture zone. These properties should be incorporated into numerical 
models to allow evaluation of the potential effect of complex fault 
mechanics on rupture characteristics. 

Methods


Joint reflection and refraction travel-time modelling of 
wide-angle seismic data 

The methods used to obtain the 2D Vp models included in our compilation
(Extended Data Table 1) include both forward and inverse
techniques. All the selected profiles (Extended Data Fig. 1) share
two key aspects. First, they are travel-time-fitting techniques based
on ray-tracing approaches, and second, they include seismic phases
reflected at the interplate boundary. The joint modelling of first arrival
(that is, refracted waves within the overriding and subducting plates)
and interplate reflection travel-times allows mapping of not only the Vp
structure but also the location and geometry of the interplate boundary,
from the trench to depths ranging between about 15 km and 30 km,
depending on data quality and experiment set-up. When 2D Vp models
and multichannel seismic data (Fig. 1) are spatially coincident, they
can be combined to improve the geological interpretation of seismic
velocities (Extended Data Fig. 2). Monte-Carlo-type statistical analysis
of several profiles, with multiple inversions using different initial
models and assuming realistic travel-time picking errors, provides
Vp uncertainty. Above the interplate boundary, this is typically
0.05-0.1 km s⁻¹ at the shallowest megathrust sector and 0.2-0.3 km s⁻¹
at about 25 km depth.


Resolution of travel-time-based seismic modelling versus 
wavelength of the stress wavefield 

Using seismic velocity models to infer earthquake rupture-related
properties implicitly assumes that rupture propagation is affected
by the properties of a rock volume that can be resolved by Vp models.
Rupture initiation depends on the stress distribution surrounding
the crack tip, and the subsequent rupture propagation and material
deformation reflect the dynamic stress transfer around the crack tip.
Rupture propagation velocity is limited by the speed at which stresses
can propagate through the material (that is, Vs for mode III rupture).
Additionally, near-field ground motion recordings of large subduction
earthquakes consistently display a peak frequency, f, of about 1-4 Hz
(refs.***). This implies that the stress transfer, whose limiting propagation
velocity along the megathrust varies with depth as indicated in
Extended Data Fig. 4c, has an associated wavelength $\lambda(z) = V_s(z)/f$,
ranging from about 0.5-1.5 km near the surface to about 1.5-4.0 km at
25 km depth (blue-lilac polygon in Extended Data Fig. 3).

Modern wide-angle seismic data can resolve Vp of the rock body
equivalent to the wavelength of the stress wavefield. For ray-based,
travel-time-fitting methods such as the ones used in this study
(Extended Data Table 1), model resolution is inferred to be limited to
the width of the first Fresnel zone, $R_F = \sqrt{zV_p/(2f_p)}$, where z is the imaged
target depth, Vp is P-wave velocity and $f_p$ is the peak frequency of the
seismic source. Taking the Vp(z) values in Extended Data Fig. 2c and
$f_p$ = 8-12 Hz, which is the typical frequency content of records, we obtain
$R_F(z)$ ranging from about 0.3-0.4 km near the trench to about 2.5-3.5 km
at 25 km depth (light red area in Extended Data Fig. 3). The comparable
size of the region resolved by travel-time-based velocity models at all
depths and the wavelength of the seismic wavefield associated with rupture
propagation (Extended Data Fig. 3) supports the idea that the
modelled Vp(z) in Fig. 2, as well as other Vp-derived properties (Fig. 2d
and Extended Data Fig. 4), represents the physical properties influencing
the dynamic stress transfer associated with the propagating seismic
rupture.
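
The two length scales compared above can be checked with a few lines of arithmetic. The sketch below evaluates the first-Fresnel-zone width, the square root of z·Vp/(2·f_p), and the stress-wavefield wavelength, Vs divided by the ground-motion peak frequency. The function names are hypothetical, and the inputs are the rounded global averages quoted in the text (Vp of about 3.0 km s⁻¹ at 1 km and 6.5 km s⁻¹ at 25 km), so the output is indicative only.

```python
import math


def fresnel_zone_width_km(depth_km, vp_km_s, source_peak_freq_hz):
    """Width of the first Fresnel zone: sqrt(z * Vp / (2 * f_p))."""
    return math.sqrt(depth_km * vp_km_s / (2.0 * source_peak_freq_hz))


def stress_wavelength_km(vs_km_s, ground_motion_peak_freq_hz):
    """Wavelength of the stress wavefield: Vs / f."""
    return vs_km_s / ground_motion_peak_freq_hz


# Rounded global-average Vp values from the text; source frequencies of 8-12 Hz.
for depth_km, vp in [(1.0, 3.0), (25.0, 6.5)]:
    widths = [fresnel_zone_width_km(depth_km, vp, f) for f in (12.0, 8.0)]
    print(depth_km, [round(w, 2) for w in widths])
```

For these inputs the Fresnel-zone width is a few hundred metres at 1 km depth and roughly 2.6-3.2 km at 25 km depth, in line with the ranges stated above.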


Estimation of V,(z) above the interplate boundary 

To obtain the Vp values as a function of interplate boundary depth
below the seafloor (Fig. 2), we first digitized Vp just above the interplate
boundary, interplate boundary depth, and seafloor depth/land
topography, along the 48 wide-angle seismic profiles (Extended Data
Table 1). Second, we interpolated Vp, interplate boundary depth below
sea surface (z_i), and seafloor depth/land topography (z_s) at constant x
intervals (2 km) along each profile by applying Akima splines to obtain
Vp as a function of upper plate thickness, z = z_i − z_s, using Generic Mapping
Tools (GMT). For simplicity, this value is referred to as ‘interplate
boundary depth b.s.’ throughout the manuscript and in the figures.
Third, we interpolated Vp at constant z intervals (1 km) along each profile,
again using GMT. Fourth, for each z between 1 km and 25 km, we
calculated the average Vp value of all profiles and its corresponding
standard deviation. Finally, we used GMT to calculate a fourth-order
polynomial regression fit of the Vp(z) values.
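
The fourth step above (averaging across profiles at each depth) can be sketched as follows. This is a minimal illustration in plain Python rather than the GMT workflow used in the study: it assumes each profile has already been resampled to integer depths, uses the population standard deviation so that depths sampled by a single profile do not fail, and omits the Akima-spline interpolation and the polynomial fit. The function name and the toy profiles are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def average_vp_by_depth(profiles, z_min=1, z_max=25):
    """Average Vp across profiles at each depth.

    profiles: list of dicts mapping integer depth z (km) to Vp (km/s).
    Returns {z: (mean Vp, standard deviation)} for z_min <= z <= z_max.
    """
    samples = defaultdict(list)
    for profile in profiles:
        for z, vp in profile.items():
            if z_min <= z <= z_max:
                samples[z].append(vp)
    return {z: (mean(values), pstdev(values)) for z, values in sorted(samples.items())}


# Toy profiles (illustrative values, not data from the compilation).
toy_profiles = [{1: 3.0, 2: 3.4}, {1: 3.2, 2: 3.6}, {2: 3.8, 30: 7.0}]
averaged = average_vp_by_depth(toy_profiles)
for z, (vp_mean, vp_std) in averaged.items():
    print(z, round(vp_mean, 2), round(vp_std, 2))
```

Profiles that do not reach a given depth simply contribute no sample there, so the number of values behind each average varies with depth.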


Derivation of rock properties from $V_p(z)$

The $V_p(z)$ values shown in Fig. 2c are used as a reference to calculate
the rest of the physical properties presented throughout the manuscript
as a function of z. The shear modulus, or rigidity, $\mu = \rho V_s^2$
(Fig. 2d), is obtained by first applying the empirically based relations
proposed in ref. ” to estimate $\rho$ (Extended Data Fig. 4b) and $V_s$
(Extended Data Fig. 4c) from $V_p$. For density, the relation is
$\rho = 1.6612V_p - 0.4721V_p^2 + 0.0671V_p^3 - 0.0043V_p^4 + 0.000106V_p^5$,
whereas for shear velocity, it is
$V_s = 0.7858 - 1.2344V_p + 0.7949V_p^2 - 0.1238V_p^3 + 0.0064V_p^4$.
Both relationships are based on a compilation of $V_p$, $V_s$ and
$\rho$ measurements for a wide variety of Earth crustal rocks, obtained
from wireline borehole logs, vertical seismic profiles, laboratory
measurements and seismic tomography models.
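The two polynomials and the rigidity formula translate directly into code. The sketch below assumes $V_p$ and $V_s$ in km/s and density in g/cm^3; that unit convention is not stated in this section and is an assumption here. The function names are illustrative.

```python
def density_from_vp(vp):
    """Density (assumed g/cm^3) from Vp (assumed km/s), polynomial quoted in the text."""
    return (1.6612 * vp - 0.4721 * vp**2 + 0.0671 * vp**3
            - 0.0043 * vp**4 + 0.000106 * vp**5)


def vs_from_vp(vp):
    """Shear-wave velocity (assumed km/s) from Vp (assumed km/s), polynomial quoted in the text."""
    return (0.7858 - 1.2344 * vp + 0.7949 * vp**2
            - 0.1238 * vp**3 + 0.0064 * vp**4)


def rigidity_gpa(vp):
    """Rigidity mu = rho * Vs^2, converted to GPa under the assumed units."""
    rho = density_from_vp(vp) * 1000.0  # g/cm^3 -> kg/m^3
    vs = vs_from_vp(vp) * 1000.0        # km/s -> m/s
    return rho * vs**2 / 1e9            # Pa -> GPa


# Illustrative Vp values (not taken from the paper's figures)
for vp in (3.0, 4.5, 6.0):
    print(f"Vp={vp:.1f} km/s  rho={density_from_vp(vp):.2f}  "
          f"Vs={vs_from_vp(vp):.2f}  mu={rigidity_gpa(vp):.1f} GPa")
```

Because rigidity depends on $V_s$ squared, a modest drop in $V_p$ towards the trench produces a much larger drop in $\mu$.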


Derivation of depth-dependent earthquake rupture 
characteristics 

Amount of slip. The seismic moment released during co-seismic rupture
is $M_0 = \int_S \mu D\,ds = \bar{\mu}\bar{D}S$, where $D$ is co-seismic slip,
$\mu$ is rigidity, $ds$ is a surface element, and the integral runs over the
whole rupture area, $S$. If we take average values over $S$ ($\bar{D}$ and
$\bar{\mu}$), $M_0$ is approximated by the right-hand side. $M_0$ and $S$ can
also be estimated from waveform data and aftershock location, respectively,
but $\mu$ and $D$ are generally unknown, so they have a trade-off in the
calculations. To show the effect of $\mu(z)$ for events of equal $M_0$ and $S$,
we compare differences in the amount of slip as a function of depth as a
slip ratio $D_r(z)$. Using as a unit reference the amount of slip at the
regular domain depth of 25 km, $D_r(z)$ can be calculated as
$D_r(z) = D(z)/D^* = \mu^*/\mu(z)$, where $D^*$ and $\mu^*$ are the amount
of slip and rigidity at the reference depth of 25 km, and $D(z)$, $\mu(z)$ are at
depth z. Figure 3a displays $D_r(z)$ calculated by taking $\mu(z)$ from the
global compilation (Fig. 2d).
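The slip ratio follows in one step from holding the moment and rupture area fixed. A short derivation, using only the quantities defined in this paragraph (the subscript $r$ for the ratio is the notation assumed here), makes that step explicit:

```latex
\begin{align}
M_0 &= \mu(z)\,D(z)\,S = \mu^{*} D^{*} S \\
\Rightarrow\quad D_r(z) &= \frac{D(z)}{D^{*}} = \frac{\mu^{*}}{\mu(z)}
\end{align}
```

As a purely illustrative figure, if rigidity at some shallow depth were one-fifth of its value at 25 km, the slip there would be five times larger for the same moment and rupture area.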


Rupture duration. Given that stresses must accumulate at the crack
tip for spontaneous propagation, rupture velocity ($v$) is limited by the
velocity at which stresses are transmitted throughout the material. For
mode III cracks, in which a shear stress acts parallel to the plane of the
crack and parallel to the crack front, as in megathrust fault ruptures,
the theoretical limiting velocity is $V_s$. Field data indicate that $v$ is
actually 70-90% of $V_s$ at the depth of the largest slip. Thus, the observed
$V_s(z)$ trend above the megathrust (Extended Data Fig. 4c) implies that
$v$ should be significantly lower in the shallow domain than in the regular
domain. To illustrate this effect, we calculated the normalized source
duration for a unit rupture length at different depths, $\tau_r(z)$, using $v(z)$
in Extended Data Fig. 4c and taking as a reference the rupture duration
at the regular domain depth of 25 km, as
$\tau_r(z) = \tau(z)/\tau^* = v^*/v(z) = V_s^*/V_s(z)$, where
$\tau^*$, $v^*$ and $V_s^*$ are rupture duration per unit length, rupture
velocity and shear velocity at the reference depth of 25 km, and $\tau(z)$,
$v(z)$, $V_s(z)$ are at depth z.
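A small numerical sketch shows why the normalized duration depends only on the shear-velocity ratio. It assumes the rupture velocity is a fixed fraction `k` of the shear velocity at both depths (the text gives 70-90%); the function name and the velocities used are hypothetical.

```python
def rupture_duration_ratio(vs_z, vs_ref, k=0.8):
    """Duration per unit rupture length at depth z, relative to the reference depth.

    k is the assumed rupture-to-shear velocity fraction (0.7-0.9 in the text),
    taken to be the same at both depths.
    """
    v_z = k * vs_z          # rupture velocity at depth z
    v_ref = k * vs_ref      # rupture velocity at the reference depth
    tau_z = 1.0 / v_z       # time to rupture a unit length at depth z
    tau_ref = 1.0 / v_ref   # same at the reference depth
    return tau_z / tau_ref


# Hypothetical shear velocities in km/s (shallow versus 25 km depth)
ratio = rupture_duration_ratio(vs_z=1.5, vs_ref=3.6)
print(f"normalized duration: {ratio:.2f}")
```

The factor `k` cancels in the ratio, which is why the normalized duration reduces to the ratio of shear velocities.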


Corner frequency. The $V_s(z)$ distribution also influences the frequency
content of the energy released during an earthquake. This can be
estimated from the spectral shape of the moment-rate spectrum, $\dot{M}(f)$,
which is calculated based on waveform records. The reference spectra
to compare with values obtained from observational data are
$\dot{M}(f) = M_0 / [1 + (f/f_c)^n]$, where $f_c$ is the corner frequency,
which marks the frequency at which the spectrum starts to decay, and $n$ is
the slope of the spectral decay. A value of $n = 2$ is typically used as a
reference, as it fits the observed decay in most cases, whereas
$f_c = cV_s(\Delta\sigma/M_0)^{1/3}$, where $c = 0.49$ is a dimensionless
constant, and $\Delta\sigma$ is stress drop. Thus, $V_s$ of the region
enclosing the propagating rupture front determines $f_c$ and therefore the
spectral shape. In contrast to $V_p$ and $V_s$, there is no clear
evidence indicating a systematic, universal depth-dependence
of $\Delta\sigma$ in subduction zones”’. Therefore, we assume that it is
depth-independent, and we isolate the $V_s(z)$ effect on the corner frequency
ratio, $f_r(z)$, for events of equal $M_0$, using $f_c(V_s)$. Taking as a
reference the $f_c$ and $V_s$ at the regular domain depth of 25 km, we have
$f_r(z) = f_c(z)/f_c^* = V_s(z)/V_s^* = \tau_r(z)^{-1}$, where $f_c^*$ and
$V_s^*$ are the corner frequency and shear-wave velocity at the reference
depth of 25 km, and $f_c(z)$, $V_s(z)$ and $\tau_r(z)$ are at depth z.
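The corner-frequency relation and the reference spectrum can be evaluated numerically as follows. The sketch assumes SI units ($V_s$ in m/s, stress drop in Pa, moment in N m) and converts magnitude to moment with the widely used relation $M_0 = 10^{1.5M_w + 9.1}$ N m, which is not given in this section and is an assumption here. The shear velocities are illustrative; the 3 MPa stress drop is the value quoted in the Extended Data Fig. 6 caption.

```python
def moment_from_mw(mw):
    """Seismic moment in N m from moment magnitude (standard relation, assumed here)."""
    return 10 ** (1.5 * mw + 9.1)


def corner_frequency(vs, stress_drop, m0, c=0.49):
    """f_c = c * Vs * (stress_drop / M0)^(1/3); Vs in m/s, stress drop in Pa, M0 in N m."""
    return c * vs * (stress_drop / m0) ** (1.0 / 3.0)


def moment_rate_spectrum(f, m0, fc, n=2):
    """Reference spectrum M0 / (1 + (f/fc)^n)."""
    return m0 / (1.0 + (f / fc) ** n)


m0 = moment_from_mw(7.4)
fc_deep = corner_frequency(3600.0, 3.0e6, m0)     # hypothetical Vs near 25 km depth
fc_shallow = corner_frequency(1500.0, 3.0e6, m0)  # hypothetical Vs near the trench
print(f"fc deep = {fc_deep:.4f} Hz, fc shallow = {fc_shallow:.4f} Hz")
```

With stress drop and moment held equal, the ratio of the two corner frequencies equals the ratio of the shear velocities, as in the expression for $f_r(z)$.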


Moment magnitude versus surface wave magnitude. Another consequence
of the $V_s$-dependent high-frequency depletion is a depth-dependent
discrepancy between the earthquake moment magnitude $M_w$, estimated
from long-period waves (approximately 250 s), and its surface wave
magnitude $M_s$, estimated at shorter periods (about 20 s)
(Extended Data Fig. 6b). This effect is illustrated in Fig. 3d, where each
curve represents the difference between $M_w$ and $M_s$ as a function of
depth, calculated from the moment-rate spectrum
$\dot{M}(f) = M_0 / [1 + (f/f_c)^n]$ for four different $M_w$. Given that
$f_c$ is anticorrelated with $M_0$ according to
$f_c = cV_s(\Delta\sigma/M_0)^{1/3}$, $M_s$ tends to saturate for large
magnitudes, so that the $M_w - M_s$ discrepancy increases with increasing
$M_w$. However, Fig. 3d shows that the discrepancy for any magnitude is
also depth-dependent.
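The saturation effect can be illustrated with a simple proxy. The authors' exact procedure for deriving $M_s$ from the spectrum is not given in this section, so the sketch below makes its own assumption: it compares the reference spectrum at periods of 250 s and 20 s and expresses the amplitude deficit in magnitude units with a 2/3 log10 scaling. The moment-magnitude conversion, the function names and the shear velocities are likewise assumptions for illustration.

```python
import math


def corner_frequency(vs, stress_drop, m0, c=0.49):
    """f_c = c * Vs * (stress_drop / M0)^(1/3), SI units."""
    return c * vs * (stress_drop / m0) ** (1.0 / 3.0)


def spectrum(f, m0, fc, n=2):
    """Reference moment-rate spectrum M0 / (1 + (f/fc)^n)."""
    return m0 / (1.0 + (f / fc) ** n)


def magnitude_gap(mw, vs, stress_drop=3.0e6):
    """Proxy for Mw - Ms: spectral deficit at 20 s relative to 250 s, in magnitude units.

    Illustrative assumption only, not the authors' calculation.
    """
    m0 = 10 ** (1.5 * mw + 9.1)              # N m, standard relation (assumed)
    fc = corner_frequency(vs, stress_drop, m0)
    long_period = spectrum(1.0 / 250.0, m0, fc)
    short_period = spectrum(1.0 / 20.0, m0, fc)
    return (2.0 / 3.0) * math.log10(long_period / short_period)


# Hypothetical shear velocities in m/s: shallow (1500) versus 25 km depth (3600)
for mw in (6.4, 7.4, 8.4):
    print(mw, round(magnitude_gap(mw, 1500.0), 2), round(magnitude_gap(mw, 3600.0), 2))
```

Under these assumptions the gap grows with magnitude and, for any given magnitude, is larger where the shear velocity is lower, which is the qualitative behaviour described for Fig. 3d.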


Data availability 


Source data for Figs. 2 and 3 and for Extended Data Figs. 4-8 are
provided with the paper. The digitized values of P-wave seismic velocity
above interplate boundary versus depth and seafloor depth along
the 48 wide-angle seismic profiles used here are available at the
public research data repository figshare
(https://doi.org/10.6084/m9.figshare.9729302.v1).


Code availability 


The scripts necessary to process the data and reproduce the main
results and figures presented in this work are available at the
public research data repository figshare
(https://doi.org/10.6084/m9.figshare.9729302.v1).
Extended Data Fig. 1 | Location map of seismic profiles and recent great and
tsunami earthquakes. Colour-coded relief map of seafloor (blue-green) and
emerged land (grey). Circles indicate location of the trench-crossing refraction
and wide-angle reflection seismic (WAS) profiles used in this study.
Yellow-filled circles are in erosional margins and red-filled circles in accretionary
margins. The location, type of margin and references for all profiles are listed
in Extended Data Table 1. Numbered white stars show locations of 12 events
recognized as tsunami earthquakes according to the definition of Kanamori”:
(1) $M_w$ 7.6, 1992 Nicaragua; (2) $M_w$ 7.6, 1960 Peru; (3) $M_w$ 7.5, 1996 Peru;
(4) $M_w$ 7.2, 1947 Hikurangi; (5) $M_w$ 7.1, 2010 Solomon; (6) $M_w$ 7.6, 1994 Java;
(7) $M_w$ 7.8, 2006 Java; (8) $M_w$ 7.8, 2010 Mentawai; (9) $M_w$ 8.0, 1896 Sanriku;
(10) $M_w$ 7.5, 1975 Kurile; (11) $M_w$ 7.8, 1963 Kurile; (12) $M_w$ 8.2, 1946 Aleutian.
Orange polygons display the rupture areas of the six largest megathrust
earthquakes since 1960: $M_w$ 9.5, 1960 Valdivia; $M_w$ 9.2, 1964 Alaska; $M_w$ 9.1,
2004 Andaman Islands; $M_w$ 9.1, 2011 Tohoku-Oki; $M_w$ 8.8, 2010 Maule; $M_w$ 8.7,
1965 Rat Island. Hypocentral location, date, magnitude and references for all
these earthquakes are listed in Extended Data Table 2. This figure has been
created using the GMT software package“. The topographic and bathymetric
relief data have been taken from the GEBCO Digital Atlas data set*”. Authors
are not aware of any disputed territories shown.
Extended Data Fig. 2 | Superposition of multichannel seismic image and
P-wave velocity model. Example of superposition of a $V_p$ model (colour, see
scale) on a spatially coincident multichannel seismic image (shading) along
profile NIC-20, acquired in the convergent margin of Nicaragua. This profile
crosses the rupture area of the 1992 Nicaragua tsunami earthquake (Extended
Data Fig. 1). Black lines show isovelocity contours with their corresponding
velocity values. White circles indicate the approximate location of the
interplate boundary, where megathrust earthquakes take place.
Extended Data Fig. 3 | Resolution of $V_p$ models and wavelength of rupture
stress wavefield. The light red polygon displays the width of the Fresnel zone
as a function of depth, assuming $V_p(z)$ in Fig. 2a and energy sources with
minimum (maximum) peak frequency f,=8 Hz (12 Hz). The blue-lilac polygon 
indicates the approximate wavelength of the stress wavefield associated to 
earthquake rupture propagation (A,,), assuming V,(z) in Extended Data Fig. 5c 
as propagation velocity, and near-field ground motion spectra with minimum 
(maximum) peak frequency f,,, =1Hz (4 Hz) (see Methods for details). 
Extended Data Fig. 4 | Physical properties versus interplate boundary
depth. a, Red (yellow) circles show $V_p$ as a function of z. It is obtained by
averaging digitized $V_p$ values of accretionary and erosional margins (red and
yellow circles, respectively, in Fig. 2b). b, White circles show density ($\rho$) just
above the interplate boundary, as a function of z, obtained by applying
Brocher’s $\rho(V_p)$ relationship”. c, Red circles show shear-wave velocity ($V_s$) just
above the interplate boundary, as a function of z, obtained by applying the
$V_s(V_p)$ relationship from Brocher (ref."’). The shaded polygon covers the range
of possible mode III rupture velocities, as a function of z, according to field
observations: $v(z) = (0.7-0.9)V_s(z)$. The black line is a fourth-order polynomial
regression fit of the $V_p(z)$, $\rho(z)$ and $V_s(z)$ values, respectively. The size of the
error bars in all cases is one standard deviation.
$\partial V_p(z)/\partial z$. It corresponds to the derivative of the $V_p(z)$ polynomial regression fit
$\mu(z)$ polynomial regression fit (black line in Fig. 2d).


[Figure axes: a, corner frequency (Hz) versus interplate boundary depth (km); b, moment rate (N-m) versus frequency (Hz).]

Extended Data Fig. 6 | Corner frequency and moment rate spectra.
a, Coloured circles show corner frequency as a function of interplate boundary
depth (z), for events of $M_w$ = 5.8-8.6. $\Delta\sigma$ = 3 MPa is used in the calculations, and
$V_s(z)$ is taken from Extended Data Fig. 3c. The colour scale indicates $M_w$. b, Solid
lines show calculated moment rate spectra for three events of $M_w$ = 6.4
(bottom), 7.4 (mid) and 8.4 (top). Black, blue and red lines correspond to $V_s$ at
z = 1 km, 6 km and 25 km, respectively, for each event. Coloured circles indicate
corner frequency according to colour code in Extended Data Fig. 6a. Vertical
dashed lines indicate periods of 250 s and 20 s, reference to calculate $M_w$
and $M_s$, respectively. In all cases, note the high-frequency depletion at
shallow depths.
Extended Data Fig. 7 | Source duration of circum-Pacific megathrust
earthquakes. a, White circles show scaled source duration of 525 moderate
size ($M_w$ 5.0-7.5) shallow megathrust subduction earthquakes from around the
circum-Pacific, as a function of depth. Black circles show the same source
parameters for six large tsunami earthquakes. Data are from ref... This article
contains all the information on the procedure followed to calculate the source
parameters. b, Red circles are the normalized source duration in panel a, averaged
within a 2-km-thick sliding window. Blue circles correspond to relative rupture
duration in Fig. 3b scaled to fit average normalized rupture duration within the
regular domain in panel a (approximately 3.5 s), and shifted 4.5 km down to
compensate for the difference between depth below sea surface and depth
below seafloor. Error bars are one standard deviation.
Extended Data Fig. 8 | Range of variation of $M_w$ for a given $M_s$. White circles
show $M_w$ for events occurring at different interplate boundary depth b.s. that
have the same spectral amplitude at 20 s (hence equivalent $M_s$).


Extended Data Table 1| Location of seismic profiles and margin type 


Number Longitude Latitude Type Region Reference


1 -86.7 10.3 E Nicaragua 41 
2 -84.13 8.47 E Costa Rica 48 
3 -80.0 1.85 E Ecuador 49 
4 -71.1 -22.2 E North Chile 50 
5 -71.3 -23.4 E North Chile 50 
6 -73.5 -34.5 A Central Chile 51 
7 -75.85 -45.55 A South Chile 52 
8 -175.1 -24.4 E Tonga 53 
9 94.7 2.45 A Sumatra 54 
10 124.5 22.6 E South Ryukyu 55 
11 135.1 32.3 A Nankai 56 
12 144.0 37.9 E Tohoku 57 
13 -125.5 44.7 A Cascadia 58 
14 -74.6 -38.0 A South Chile 38 
15 -81.1 -8.5 E North Peru 59 
16 -79.0 -11.6 E Central Peru 59 
17 -77.5 -13.5 E Central Peru 59 
18 -76.5 -15.3 E South Peru 60 
19 113.0 -10.65 E Java 61 
20 179.9 -38.6 E New Zealand 62 
21 160.9 -7.8 E Solomon 63 
22 142.25 32.25 E Izu Bonin 64 
23 -75.75 -44.5 A South Chile 52 
24 -87.6 11.2 E Nicaragua 65 
25 -86.4 9.7 E Costa Rica 66 
26 183.8 -29.5 E Kermadec 67 
27 122.15 23.35 E Taiwan 68 
28 106.3 -9.25 E Java 69 
29 116.0 -11.4 E Lombok 70 
30 118.9 -11.2 E Sumba 70 
31 -148.0 57.4 A Alaska 71 
32 -72.5 -31.0 E Central Chile 39 
33 146.6 42.0 E Kuril 72 
34 137.1 32.9 A Tonankai 73 
35 134.75 31.9 A Shikoku 74 
36 93.2 4.3 A Sumatra 75 
37 -72.65 -32.1 E Central Chile 39 
38 -59.1 16.8 A Antilles 76 
39 -84.5 8.75 E Costa Rica 77 
40 -71.2 -21.1 E North Chile 78 
41 -81.5 -1.3 E Ecuador 79 
42 -81.6 -2.6 E Ecuador 80 
43 -75.5 -42.8 A South Chile 52 
44 -157.0 53.5 A Alaska 81 
45 144.0 38.1 E Miyagi 82 
46 131.5 29.25 A North Ryukyu 83
47 127.5 24.7 E Central Ryukyu 83
48 126.6 23.95 E South Ryukyu 83


Geographical locations of the 48 wide-angle seismic profiles along the circum-Pacific included in the data set. The type of margin is indicated in the fourth column by an A (accretionary) or an E 
(erosional), which correspond to the red and yellow circles in Extended Data Fig. 1, respectively. Data taken from refs. °°4"®?, Reference number is indicated in the last column. 
Extended Data Table 2 | Location and magnitude of circum-Pacific megathrust earthquakes


Number Longitude Latitude Day Month Year Ms Mw Region Reference 
Largest Megathrust Earthquakes 


1 -73.407 -38.143 22 5 1960 - 9.5 Valdivia 84 
2 -147.34 60.91 28 3 1964 - 9.2 Alaska 85 
3 95.982 3.295 26 12 2004 - 9.1 Andaman 86 
4 142.373 38.297 11 3 2011 - 9.1 Tohoku 87 
5 -72.898 -36.122 27 2 2010 - 8.8 Maule 98 
6 178.715 51.251 4 2 1965 - 8.7 Rat Island 89 
Tsunami Earthquakes 
1 -87.38 11.75 2 9 1992 7.0 7.6 Nicaragua 12 
2 -80.90 -6.72 20 11 1960 6.75 7.6 Peru 22
3 -80.23 -9.95 21 2 1996 - 7.5 Peru 90 
4 178.9 -38.8 25 3 1947 7.2 7.1 Hikurangi 91 
5 157.3 -8.3 3 1 2010 - 7.1 Solomon 92 
6 113.04 -10.28 2 6 1994 7.2 7.6 Java 27 
7 107.39 -9.26 17 7 2006 7.2 7.8 Java 93 
8 100.14 -3.49 25 10 2010 7.1 7.8 Mentawai 23 
9 144.0 39.5 15 6 1896 7.2 8.0 Sanriku 94 
10 147.63 43.07 10 6 1975 7.0 7.5 Kurile 22 
11 150.7 44.7 20 10 1963 7.2 7.8 Kurile 22 
12 163.1 53.32 1 4 1946 7.4 8.2 Aleutian 95 


Geographical location, date, and magnitudes of the circum-Pacific megathrust earthquakes shown in Extended Data Fig. 1. The list includes the six largest-magnitude earthquakes that have
occurred since 1960, as well as 12 events identified as tsunami earthquakes. Data taken from refs. ”77°4®5. Reference numbers for the source parameters and energy release characteristics of 
all the events are indicated in the last column. 
The evolution of the mammalian middle ear is thought to provide an example of
‘recapitulation’, the theory that the present embryological development of a species
reflects its evolutionary history. Accumulating data from both developmental biology
and palaeontology have suggested that the transformation of post-dentary jaw 
elements into cranial ear bones occurred several times in mammals?. In addition, 
well-preserved fossils have revealed transitional stages in the evolution of the 
mammalian middle ear’**. But questions remain concerning middle-ear evolution, 
such as how and why the post-dentary unit became completely detached from the
dentary bone in different clades of mammaliaforms. Here we report a definitive 
mammalian middle ear preserved in an eobaatarid multituberculate mammal, with 
complete post-dentary elements that are well-preserved and detached from the 
dentary bones. The specimen reveals the transformation of the surangular jaw bone 
from an independent element into part of the malleus of the middle ear, and the
presence of a restricted contact between the columelliform stapes and the flat incus. 
We propose that the malleus-incus joint is dichotomic in mammaliaforms, with the 
two bones connecting in either an abutting or an interlocking arrangement, reflecting 
the evolutionary divergence of the dentary-squamosal joint*. In our phylogenetic 
analysis, acquisition of the definitive mammalian middle ear in allotherians such as 
this specimen was independent of that in monotremes and therians. Our findings 
suggest that the co-evolution of the primary and secondary jaw joints in allotherians 
was an evolutionary adaptation allowing feeding with unique palinal (longitudinal 
and backwards) chewing. Thus, the evolution of the allotherian auditory apparatus 
was probably triggered by the functional requirements of the feeding apparatus. 


Mammalia Linnaeus, 1758 
Multituberculata Cope, 1884 
Eobaataridae Kielan-Jaworowska, Dashzeveg and Trofimov, 1987 
Jeholbaatar kielanae gen. et sp. nov. 


Etymology. Jehol derives from the Jehol Biota ecosystem of Cretaceous
northeastern China; baatar (Mongolian), meaning hero, is a common
suffix for Asian Cretaceous multituberculate names; kielanae honours
the Polish palaeontologist Zofia Kielan-Jaworowska for her contribution
to the study of multituberculates.

Holotype. A nearly complete skeleton (IVPP V20778; Fig. 1), housed 
in the Institute of Vertebrate Paleontology and Paleoanthropology, 
Beijing, China. 

Locality and age. The specimen is from the Jiufotang Formation near 
Changzigou, Lingyuan City, Liaoning Province, China, dated to approxi- 
mately 120 million years ago’. 

Diagnosis. Dental formula of ?-C°-P*-M’/I,-Cy-P3-M, (I, incisor; C, canine; 
P, premolar; M, molar; superscript and subscript denote upper and
lower teeth, respectively), with the following multituberculate
characteristics (Extended Data Figs. 1, 2): cranium dorsoventrally
pressed; masseteric fossa anteriorly extending below lower premolars; 
lingual offset of M? relative to M’; enlarged single lower incisor; blade- 
like P,; definitive mammalian middle ear (Extended Data Figs. 3, 4). 
Among multituberculates, Jeholbaatar is referable to eobaatarids on 
the basis of: upper canines absent; I? transversely wide; and eight
serrations and a posterolabial cusp on P,. Jeholbaatar differs from most
eobaatarids (except for Eobaatar and Heishanbaatar) in having eight
serrations on P,; differs from Eobaatar in having reduced P,_,, more
buccal cusps on M!, and a ridged cusp row on P*; differs from Heishanbaatar
in having an oval lateral outline of P; and more cusps of lower molars; 
and differs from Sinobaatar in having a posterior cuspule on I’, two 
cusp rows of P®, and different cusp counts of upper and lower molars. 

Phylogenetic analyses place Jeholbaatar within the monophyletic
eobaatarids and closely related to Sinobaatar (Extended Data Figs. 5, 6). 
The body mass of Jeholbaatar is estimated to be approximately 50 g 
on the basis of its skull length® (see Supplementary Information). 


¹Key Laboratory of Vertebrate Evolution and Human Origins of Chinese Academy of Sciences, Institute of Vertebrate Paleontology and Paleoanthropology, Beijing, China. ²Centre for Excellence
in Life and Paleoenvironment, Chinese Academy of Sciences, Beijing, China. ³Division of Paleontology, American Museum of Natural History, New York, NY, USA. ⁴College of Earth and Planetary
Sciences, University of Chinese Academy of Sciences, Beijing, China. *e-mail: wangyuanging@ivpp.ac.cn
Fig. 1 | The Cretaceous multituberculate Jeholbaatar kielanae. a, Holotype
(IVPP V20778) in dorsal view. b, Line drawing of the holotype. Grey shading
indicates damaged elements. cal, first caudal vertebra; cl, atlas; ep, epipubis;
il, ilium; lf, left fibula; lfe, left femur; lh, left humerus; lmt, left metatarsals; lr,
left radius; ls, left scapulocoracoid; lt, left tibia; lts, left tarsals; lu, left ulna; pf,
parafibula; ph, phalanges; rc, right clavicle; rcp, right carpals; rh, right
humerus; rfb, right fibula; rfe, right femur; rl’, right I; rl, right lower jaw; rmc,
right metacarpals; rmt, right metatarsals; rM/, right M}; rr, right radius; rs, right
scapulocoracoid; rt, right tibia; ru, right ulna; st, sternum.


Jeholbaatar is inferred to be scansorial on the basis of its manual and
pedal morphology, and the phalangeal index of its third digits is the
greatest among multituberculates, with postcranials preserved (Fig. 1, 
Extended Data Fig. 7 and Extended Data Table 1). Given the morphology 
of the lower cheek teeth, we infer that Jeholbaatar is similar to Eobaatar
in having an omnivorous diet, feeding on arthropods, worms and plant 
items’. The palinal jaw joint* and the distinct attachment for the mas- 
seter muscle suggest a unique palinal jaw movement while chewing 
(Extended Data Fig. 1). 

The well-preserved left middle-ear bones are mediodorsally exposed 
and articulated nearly in anatomical position (Fig. 2 and Extended Data
Figs. 3, 4). This unit is clearly detached from the dentary, as indicated
by the absence of a sulcus on the lingual side of the dentary, as in other
multituberculates; thus, Jeholbaatar has by definition the definitive
mammalian middle ear (DMME)®. The stapes is columelliform-microperforate
and distinct from the typical columelliform stapes of Lambdopsalis’
in having a robust shaft, a less expanded stapedial footplate
(the proximal end of the stapes), and a more basally positioned stapedial
foramen (Extended Data Fig. 3). Laterally, the stapedial head, not
exposed fully in dorsal view, is narrowed (relative to the proximal
end) as in therian mammals, and articulates with the stapedial process
(the long crus) of the incus through a restricted contact (relative to
the broad end-on contact in other mammaliaforms”), preserving no
sign of the extrastapes. The complete incus, previously unknown in a
multituberculate, is slightly displaced ventrally, revealing its shape 
and orientation. The incus body is flat and lies medial to the transverse 
portion of the malleus body. Its morphology and small size suggest 
that the incus may not contact the squamosal dorsally. The proximal 
portion of the anterior process of the malleus is dorsally thicker than 
the transverse portion of the malleus body. A foramen, presumably 
for the chorda tympani, perforates the malleus (Extended Data Fig. 3). 
The malleus body bears a short manubrium projecting anteroventrally. 
The ventrolateral part of the malleus is thick and wedge-shaped, which 
we interpret as the remnant of the surangular. It consists of an ante- 
rior projection and a convex posterior end (surangular boss; Fig. 2). 
The posterior portion of the surangular extends posterolaterally and 
the posterodorsal surface of the surangular boss remains smooth 
and restricted medially by a distinct neck, which is reminiscent of an 
articular surface. The ectotympanic is large and roughly sickle-shaped
with a gently posteriorly curved ventral limb and no anterior limb. The
posterior portion of the horizontal limb is slightly expanded medially.
The ectotympanic connects firmly to the malleus, suggesting that they
may function as a unit.


Fig. 2 | Middle-ear bones of Jeholbaatar kielanae. a, Left middle-ear bones,
slightly displaced between the right lower jaw and braincase. b, Left middle ear
bones exposed in dorsal view. c, Interpretative drawing of left ear bones.
d-f, Reconstructions of left middle ear bones, showing articulation of these
elements, in dorsal (d), posterior (e) and ventral (f) views; the surangular, the
malleus body and the anterior process of the malleus are combined as a unit. A,
anterior; ap, anterior process of malleus; ct, foramen for chorda tympani; D,
dorsal; et, ectotympanic; fp, footplate of stapes; in, incus; L, lateral; ma, body of
malleus; mb, manubrium of malleus; sb, surangular boss; st, stapes.

The specimen provides important evidence regarding mammalian 
middle-ear evolution, revealing a unique configuration with more 
complete and complex components than those reported previously in 
Cretaceous multituberculates”. Under our phylogenetic framework, 
the DMME has evolved independently at least three times, in allotheri- 
ans, monotremes and trechnotherians (Fig. 3). 

Detachment of the auditory bones from the dentary was accompa- 
nied by loss of the anterior limb of the ectotympanic during develop- 
ment of the DMME, which evolved in parallel in monotremes, therians 
and allotherians (Arboroharamiya and Jeholbaatar)*”. The hook-like 
ectotympanic is plesiomorphic for early mammals, as demonstrated 
by Arboroharamiya and Jeholbaatar, contrasting with the ring-like 
form of the Early Cretaceous Ambolestes”. 

The incus-stapes complex has been simplified in Jeholbaatar through
a reduction in size and restricted incus—stapes contact. Whether the 
rod-like or the asymmetric bicrurate form represents the ancestral 
morphotype of the mammaliaform stapes is still disputed". Jeholbaatar 
reveals a transitional stage in the evolution of the stapes, intermediate 
between the rod-like form (observed in cynodonts, Arboroharamiya 
and Chaoyangodens*”’) and the typical columelliform morphology 
(with a slender shaft, as seen in Lambdopsalis’). Although there are 
different interpretations of some previously reported multituberculate 
stapes“, the robust shaft and less expanded footplate of the stapes in
Jeholbaatar is distinct from the asymmetric bicrurate morphology
(observed in Pseudobolodon"). This suggests several processes for 
evolution of the stapes in mammaliaforms, with independent acquisition
of a bicrurate morphotype in Pseudobolodon and Kryptobaatar.
The restricted incus-stapes contact of Jeholbaatar is derived by
comparison with other mammaliaforms that have a broad end-on contact
between these two bones. The development of the stapedial process of
the incus, as an out-lever of the lever system during sound transition,
is beneficial for the amplification of airborne sound!.

Identification of the surangular in Jeholbaatar reinforces the argument
that the remnant of the ancient ‘reptilian’ element exists in crown
mammals (allotherians)*. It also fills a gap in the fossil record of the
transformation of the surangular from an independent element to
an accessory of the malleus*“, providing clues to the evolution of the
surangular in mammaliaforms. In Jeholbaatar, the manubrium of the
malleus is short and gradually tapers anterolaterally from the malleus
body. This is the plesiomorphic condition, lacking the clear distinction
between the manubrial base and the manubrium observed in other
known Mesozoic mammaliaforms that have preserved the middle-ear
bones (except Arboroharamiya)*. Jeholbaatar also provides evidence
of a thickened malleus in a Mesozoic mammaliaform. This condition
is defined specifically by the expression of the Bapx1 gene in mice”,
implying similar embryonic development in Jeholbaatar.


Fig. 3 | Evolution of the mammalian middle ear in different mammaliaform
clades. This simplified phylogeny is based on the strict consensus of
parsimony analysis (see Extended Data Fig. 5 and Supplementary Information).
The green arrows denote independent evolution of the DMME in
mammaliaforms. In the second column from the right are the middle-ear bone
complexes of different taxa; at the right are the corresponding diagnostic
configurations of the abutting and interlocking systems of the malleus-incus
complex: in the abutting system, the malleus and incus contact dorsoventrally;
in the interlocking system, this contact is rostrocaudal. Reconstructions of left
middle ears are taken from the literature (see Methods and Supplementary
Information). Graded colours in the key at the bottom denote Early, Middle and
Late periods. Ma, million years ago.


The malleus-incus complex of Jeholbaatar is similar to that of
Arboroharamiya and extant monotremes*”°, with a dorsoventral contact of
the malleus-incus complex. Given that the mammaliaform malleus— 
incus complex is derived from the primary joint in lower tetrapods, this 
raises an interesting issue concerning how the incus shifted dorsal to 
the malleus body during the transformation of the middle-ear bones. 
We propose that the articular configuration of the malleus-incus com- 
plex is dichotomic among mammaliaforms: the abutting system is 
characterized by a dorsoventral contact, as observed in monotremes, 
Arboroharamiya, and Jeholbaatar*”®; and the interlocking system has a
rostrocaudal contact (and later hinge-like articulation), as observed in
Morganucodon", Liaoconodon and other mammals except allotherians
and monotremes’. This interpretation of the malleus-incus articulation 
contradicts previous proposals regarding other multituberculates”®. 
However, in light of the unequivocal articulated middle-ear bones in 
Jeholbaatar, we postulate that the abutting system persisted in later 
multituberculate evolution. Whether this configuration is consistent 
in allotherians that have a mandibular mammalian middle ear (such 
as Haramiyavia) and transitional mammalian middle ear remains 
unknown. It has been proposed that the primary joint (malleus and 
incus) in mammals is determined by members of the Gdf gene family 
(Gdf5 and Gdf6)"*. If all of these hypotheses are correct, then the
developmental divergence of the primary joint (as reflected in the
malleus-incus articulation) in mammaliaforms occurred deep in the Middle
to Late Jurassic period, resulting in a shift in the position of the incus 
dorsal over the malleus (Extended Data Fig. 4). Despite the morpho- 
logical distinction of the middle-ear bones between Jeholbaatar and 
Arboroharamiya, the configuration of the abutting system is coincident 
with the palinal jaw joint in multituberculates and euharamiyidans. 
The timing of the divergence of malleus-incus configurations (the 
abutting and interlocking systems) and the dichotomic morphotype 
of the squamosal-dentary jaw joint (palinal and hinge-like)* supports 
the hypothesis that the primary and secondary jaw joints co-evolved 
in allotherians.

The evolution of the DMME is associated with morphogenetic pro- 
cesses in the post-dentary bones, and causes of the detachment of 
Meckel’s cartilage are hierarchical’. Palaeontological and developmental
findings have rendered two conventional hypotheses for the
degeneration of Meckel’s cartilage (the brain-expansion hypothesis” 
and negative ontogenetic allometry of the middle-ear bones”*) less 
plausible*°**”°. Instead, given the evidence from Arboroharamiya
and Jeholbaatar, the evolution of the DMME in allotherians might be 
explained by biomechanical functional constraints during feeding”’”®, 
with co-evolution of the primary and secondary jaw joints being an 
adaptation for the unique palinal chewing of allotherians. Earlier
acquisition of the DMME in allotherians also implies a shortened transitional
mammalian middle-ear stage. The abutting-system configuration per- 
mitted longitudinal and vertical reduction of the middle-ear bones in 
some mammaliaforms. Detachment of the middle-ear bones (followed 
by better handling of biomechanical loads related to mastication on 
the medial side of the dentary”’) and the abutting-system configura- 
tion could have increased the degree of food comminution per palinal 
power stroke in those allotherians with the DMME, and reduced the 
impact of feeding on the hearing apparatus. As such, selective pres- 
sure to detach the middle-ear bones (the hearing apparatus) in order 
to increase feeding efficiency could have been stronger in allotherians 
than in clades characterized by the interlocking system, showing that
feeding was an important trigger in DMME evolution. 

The homoplastic evolution of the DMME observed in fossils is consist- 
ent with developmental evidence, revealing diverse mechanisms for the 
detachment of Meckel’s cartilage in different lineages”°. The presence 
of the surangular remnant in Jeholbaatar might represent a recapitulation
of the ancestral state, and suggests that evolution of the DMME
could be an instance of von Baer’s law of embryology*°, although this
hypothesis requires further investigation in a developmental context. 
Methods


Specimen preparation 

At the early stage of preparation, the specimen was mainly exposed in 
dorsal view. After it was scanned using computed laminography, the 
skull was prepared from the backside of the slab to expose the skeletal 
morphology in ventral view. 


Measurements 
Skeletal elements were measured in ImageJ.


Figures 

Middle ear reconstructions are based on the following references: 
Thrinaxodon, Morganucodon and Didelphis, ref. 7°; Hadrocodium, 
refs. 81; Pseudotribos, ref. **; Ornithorhynchus, ref.'°; Teinolophos, 
ref. °; Liaoconodon, ref.?; Haramiyavia, ref.**; Arboroharamiya, ref. *; 
Zhangheotherium, ref.*°; Ambolestes, ref. ?; Haldanodon, ref.*. 


Computed laminography 

We carried out scanning using a microcomputed laminography system 
(developed by the Institute of High Energy Physics, Chinese Academy 
of Sciences (CAS) at the Key Laboratory of Vertebrate Evolution and 
Human Origins, CAS). The specimen was scanned with a beam energy 
of 60 kV and a flux of 40 µA at a resolution of 8.7 µm per pixel, using
a 360° rotation with a step size of 1°. We reconstructed a total of 360
image slices with a size of 2,048 × 2,048 pixels using a modified Feldkamp
algorithm developed by the Institute of High Energy Physics,
CAS. Three-dimensional reconstruction of the auditory bones and 
teeth was conducted in VGStudio 3.0. 


Taxonomic terminology 

We use the node-based concept for crown clades of Mammalia; the term 
‘mammaliaforms’ refers to taxa in Mammaliaformes”™. Given recent 
studies**®, we regard Allotheria as a monophyletic group, and we test 
this hypothesis in our phylogenetic analyses. The content of the clade 
Euharamiyida follows previous work**. 


Phylogenetic analysis 

We conducted two sets of phylogenetic analyses separately, using 
different data matrices to explore the placement of the new taxon in 
the mammaliaforms and multituberculates. The list of morphologi- 
cal characters for mammaliaform phylogeny follows ref. * (derived 
from refs. °°’), with separate analysis of two character matrices, A 
and B. We created a data matrix for multituberculate phylogeny analy- 
sis by adding new taxa and characters to expand the matrix in order 
to include 51 taxa and 130 characters on the basis of a newly published
data matrix* (see Supplementary Information). Data matrices were 
edited in Mesquite v.3.03 and saved in NEXUS format for parsimony 
and Bayesian analysis. Bayesian analysis for mammaliaform or mul- 
tituberculate phylogeny was run for 100 million Markov Chain Monte 
Carlo generations, with the first 25% discarded as ‘burn-in’, using the 
Mkv model for discrete morphological data and a gamma parameter 
for rate variation in MrBayes 3.2 (ref. *!). Posterior probabilities were 
calculated to assess node robustness in MrBayes. Parsimony analysis 
was performed using TNT 1.5 with the New Technology Search method, 
implementing sectorial search, ratchet, drift and tree fusing, under 
equally weighted parsimony”. As is conventional for large datasets, 
200 ratchet iterations, 100 drift cycles and 10 rounds of tree fusion 
were applied to conduct comprehensive searches during phylogenetic 
analysis. Two separate parsimony analyses were conducted, one with 
all characters unordered and the other with 19 characters ordered for
the multituberculate data matrix, respectively. These ordered characters
are 17, 25, 26, 29, 31, 32, 43, 46, 47, 48, 49, 51, 52, 55, 58, 59, 61, 72
and 85, as suggested previously****. Node support is given as Bremer
support values in strict consensus of parsimony analysis, and as pos- 
terior probabilities (percentage) in 50% majority-rule consensus of 
Bayesian analysis. 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The specimen (IVPP V20778) reported here is housed in the Institute 
of Vertebrate Paleontology and Paleoanthropology, Beijing, China. 
Character matrices are given in the Supplementary Information. 
condyle, which is positioned below the occlusal level of the lower molars and
faces posteriorly in IVPP V20778. Together with the distinct masseteric fossa—
which presumably provides attachment for a well developed masseteric 
muscle, inserting anteriorly below P,—the glenoid fossa produces a palinal 
(posterior) power stroke with distinct posterior chewing. 
maintains the interlocking system (IS) arrangement (yellow arrow), with a
rostrocaudal contact between these two elements. e, Left auditory bones of
Liaoconodon in medial view (modified from ref. °). f, Left auditory bones of
Morganucodon in medial view (modified from ref. *8). Here the incus (quadrate) 
has a medial trochlear facet to contact the concave surface of the malleus body 
(articular fossa) posteriorly. 
Extended Data Fig. 5 | Strict consensus of parsimony analysis based on data
matrix A. Tree length, 2,622; consistency index, 0.327; retention index, 0.795.
On the basis of analysis using TNT 3.0, 14 most parsimonious trees are
returned; tree length, 2,539, consistency index, 0.338; retention index, 0.804.
The blue shading shows the monophyly of allotherians within crown mammals.
Node supports are given as Bremer support values.
Extended Data Fig. 6 | Results of Bayesian analysis of multituberculates.
This 50% majority-rule consensus was obtained from 10 million Markov Chain
Monte Carlo generations with a 25% burn-in fraction. Node supports are listed
as posterior probabilities (percentages). The blue rectangle shows the
monophyly of eobaatarids, with Jeholbaatar closely related to Sinobaatar.

[Tree figure. Tip labels: Sinoconodon, Morganucodon, Thomasia, Haramiyavia,
Meketichoffatia, Henkelodon, Rugosodon, Paulchoffatia, Meketibolodon,
Guimarotodon, Kuehneodon, Ctenacodon, Glirodon, Bolodon, Plagiaulax,
Sinobaatar, Jeholbaatar, Heishanobaatar, Hakusanobaatar, Eobaatar,
Liaobaatar, Arginbaatar, Cimexomys, Boffius, Meniscoessus, Buginbaatar,
Cimolodon, Ectypodus, Ptilodus, Neoliotomus, Mesodma, Catopsbaatar,
Kryptobaatar, Chulsanbaatar, Kamptobaatar, Nemegtbaatar, Eucosmodon,
Stygimys, Microcosmodon, Pentacosmodon, Taeniolabis, Catopsalis,
Lambdopsalis, Sphenopsalis, Prionessus, Yubaatar, Kogaionon, Litovoi,
Barbatodon, Zofiabaatar, Pseudobolodon.]
Extended Data Fig. 7 | Manual and pedal structure, and ternary diagrams
showing the proportions of phalanges from manual and pedal digit III.
a, b, Shoulder (a) and pelvic (b) girdles in dorsal view. c, d, Right manus (c) and
pes (d) in lateral view. e, f, Ternary plots showing ratios of metapodial
(metacarpal or metatarsal), proximal and intermediate phalanges for
Jeholbaatar digit III from the manus (e) and pes (f), and comparison with some
extant terrestrial and arboreal mammals. The lengths of these three phalanges
are shown as ratios of the combined length of these elements. Mc, metacarpal;
Mt, metatarsal. The lengths of Jeholbaatar manus and pes elements (in mm,
with asterisks indicating damaged elements) are: Mc I, 2.76; Mc II, *2.84; Mc III,
*3.70; Mc IV, *2.81; Mc V, 2.79; digit I proximal phalanx, 1.98; digit II proximal
phalanx, 2.84; digit II intermediate phalanx, *1.60; digit III proximal phalanx,
2.40; digit III intermediate phalanx, 2.26; digit IV proximal phalanx, *2.22; digit
IV intermediate phalanx, 1.83; digit V proximal phalanx, 1.92; digit V
intermediate phalanx, 1.54; phalange index, that is, (proximal plus
intermediate)/metacarpal, digit III, 126%; Mt I, 3.92; Mt II, 4.99; Mt III, 5.42; Mt
IV, *1.69; Mt V, *3.33; digit I proximal phalanx, 3.51; digit II proximal phalanx,
3.58; digit II intermediate phalanx, 2.82; digit III proximal phalanx, 3.59; digit III
intermediate phalanx, 3.46; digit IV proximal phalanx, *1.73; digit IV
intermediate phalanx, 3.25; digit V intermediate phalanx, 2.63; phalanx index,
that is, (proximal+intermediate phalanges)/metatarsal, digit III, 130%. The
manual proportion of J. kielanae places it closer (than the other
multituberculates in the sample) to the arboreal category; the pedal
proportion clusters mostly with arboreal taxa. The data for extant taxa are
derived from ref. **.
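
The phalangeal index quoted in the caption can be checked directly from the listed digit III lengths. The following is a minimal sketch, not the authors' code: the function names `phalangeal_index` and `ternary_proportions` are hypothetical, and the only inputs are the millimetre values given in the caption (Mc III is marked as damaged there, so the manual value should be read as approximate).

```python
# Digit III lengths in mm, copied from the Extended Data Fig. 7 caption.
# Mc III carries an asterisk (damaged element) in the caption.
MANUS_DIGIT_III = {"metapodial": 3.70, "proximal": 2.40, "intermediate": 2.26}
PES_DIGIT_III = {"metapodial": 5.42, "proximal": 3.59, "intermediate": 3.46}


def phalangeal_index(metapodial, proximal, intermediate):
    """(proximal + intermediate) / metapodial, expressed as a percentage."""
    return 100.0 * (proximal + intermediate) / metapodial


def ternary_proportions(metapodial, proximal, intermediate):
    """Each element as a fraction of the combined length (ternary-plot coordinates)."""
    total = metapodial + proximal + intermediate
    return (metapodial / total, proximal / total, intermediate / total)


manual_index = phalangeal_index(**MANUS_DIGIT_III)
pedal_index = phalangeal_index(**PES_DIGIT_III)
manual_ternary = ternary_proportions(**MANUS_DIGIT_III)
pedal_ternary = ternary_proportions(**PES_DIGIT_III)

if __name__ == "__main__":
    print(f"manual digit III index: {manual_index:.1f}%")
    print(f"pedal digit III index:  {pedal_index:.1f}%")
    print("manual ternary proportions:", [round(p, 3) for p in manual_ternary])
    print("pedal ternary proportions: ", [round(p, 3) for p in pedal_ternary])
```

Rounded to whole percentages, the two indices agree with the 126% and 130% reported in the caption.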
Extended Data Table 1 | Phalange indices for digit III of Jeholbaatar and comparison with other mammals

Taxa  Substrate  Phalange index (manual digit III)  Phalange index (pedal digit III)

Maotherium sinensis  Terrestrial  95%  99%
Didelphis virginiana  Terrestrial  98%  88%
Eomaia scansoria  Scansorial/Terrestrial  130%  129%
Caluromys philander  Scansorial/Terrestrial  138%  156%
Arboroharamiya jenkinsi  Arboreal  246%  216%
Kryptobaatar dashzevegi  Terrestrial  81%
Rugosodon eurasiaticus  Scansorial/Terrestrial  117%  114%
Sinobaatar lingyuanensis  Scansorial/Terrestrial  108%  109%
Ptilodus kummae  Arboreal  118%
Eucosmodon sp.  Arboreal  119%
Jeholbaatar kielanae  Scansorial/Terrestrial  126%  130%
Reporting Summary


Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency 
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Software and code 


Policy information about availability of computer code 


Data collection Data collection for building character matrix used in this study is based on observation of specimens. 


Data analysis Segmentation was conducted in VGStudio v.3.0. Character matrix was compiled in Mesquite v.3.03. For phylogenetic analysis, parsimony 
analysis was conducted in TNT v. 1.5 and Bayesian analysis was conducted in MrBayes v. 3.2. Measurements were taken in ImageJ. 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 


All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 
- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability


The specimen reported in this study is housed in an academic institute and available for scholars to examine. Data matrices were provided in Supplementary 
Information. 
Field-specific reporting


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


Life sciences; Behavioural & social sciences; Ecological, evolutionary & environmental sciences


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Ecological, evolutionary & environmental sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Study description It is a study on only one fossil specimen with phylogenetic analysis. 
Research sample It is a fossil mammal specimen from the Lower Cretaceous of Northeast China. Phylogenetic analysis is based on character matrices
that cover various taxa from all mammaliaform clades with emphasis on multituberculates.


Sampling strategy Fossil collecting in fieldwork and specimen preparation in the lab. The taxon sampling and characters selected are extensive enough to
reconstruct phylogeny for both mammaliaforms and multituberculates. 


Data collection Data collection includes observation of specimens with microscope in lab and computed laminography. 
Timing and spatial scale H.W. collected data from July, 2015 to June, 2018, based on observations of specimens. 
Data exclusions No data were excluded.


Reproducibility Phylogenetic analysis is repeatable, following the methods for both parsimony analysis (in TNT) and Bayesian analysis (in MrBayes) in
the Methods section.


Randomization N/A. It is a study on fossil material.
Blinding N/A. Character matrices are built on the basis of independent observation for each taxon.
Did the study involve field work? Yes No 


Field work, collection and transport 


Field conditions Annual average temperature is 5.4°C ~ 8.7°C and annual precipitation is 450-480mm. 
Location Chaoyang City, Liaoning province, China.


Access and import/export We investigate the fossil locality, prepare the specimen, and scan the specimen at the Institute of Vertebrate Paleontology and 
Paleoanthropology. 


Disturbance N/A 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems: Antibodies; Eukaryotic cell lines; Palaeontology; Animals and other organisms; Human research participants; Clinical data

Methods: ChIP-seq; Flow cytometry; MRI-based neuroimaging


Palaeontology 


Specimen provenance This specimen was discovered in the Jiufotang Formation at the Changzigou site, Lingyuan county, Liaoning province, China. No
permits were needed for the work.


Specimen deposition The specimen reported in this study is housed in the Institute of Vertebrate Paleontology and Paleoanthropology, Beijing, China. 


Dating methods No new dates for the specimen.


Tick this box to confirm that the raw and calibrated dates are available in the paper or in Supplementary Information. 
The underrepresentation of non-Europeans in human genetic studies so far has
limited the diversity of individuals in genomic datasets and led to reduced medical
relevance for a large proportion of the world’s population. Population-specific
reference genome datasets as well as genome-wide association studies in diverse
populations are needed to address this issue. Here we describe the pilot phase of the 
GenomeAsia 100K Project. This includes a whole-genome sequencing reference 
dataset from 1,739 individuals of 219 population groups and 64 countries across Asia. 
We catalogue genetic variation, population structure, disease associations and 
founder effects. We also explore the use of this dataset in imputation, to facilitate 
genetic studies in populations across Asia and worldwide. 


The underrepresentation of non-European individuals in human 
genetic studies! limits the applicability of the results for a large pro- 
portion of the world’s population’. Reference genome datasets* “are 
needed to characterize population-specific variation, enable efficient 
imputation of variants that are not directly genotyped, and extend 
genome-wide association studies (GWAS) to additional populations. 
The value of population-specific reference datasets is well recognized 
and projects based in the United States and Europe have provided deep 
characterization of specific populations (for example, Ashkenazi Jews” 
and individuals from the Netherlands? and Iceland”) and, in particular, 
data from individuals of Nordic countries have provided examples of 
how reference genome datasets can be used to drive comprehensive 
genetic studies across an entire population”. In Africa, populations 
show complex genetic patterns, smaller blocks of linkage disequi- 
librium and higher levels of heterozygosity, which provides unique 
value for genetic studies. Across the continent, early reference genome 
datasets for diverse populations are being built as part of H3Africa 
and other studies*’. A Korean reference genome as well as Japanese 
and Chinese reference genome datasets have been created, and the 
formation of large biobanks such as BioBank Japan” and the China 
Kadoorie Biobank”’ will accelerate the pace of discovery of disease 
associations across east Asia. 

A shared recognition of the value of coordinated efforts and the need
for reference genome datasets that would be useful for the complex 
populations of Asia has led to the formation of the GenomeAsia consortium
(http://www.genomeasia100k.com). The consortium serves
to facilitate and coordinate sequencing efforts among consortium 
members to maximize the value of the genomic sequence data that is 
produced and to facilitate efforts by national or other regional groups. 
Here we describe the GenomeAsia Pilot (GAsP) project, which consists 
of analyses of the whole-genome sequencing data of 1,739 individu- 
als from 219 population groups across Asia, with the ultimate goal of 
providing a useful genomic resource and facilitating genetic studies in 
Asia. We use the data that was generated in this pilot to analyse popula- 
tion structure and history, and as the basis for designing larger-scale 
genomic studies. Furthermore, we explore disease-associated loci as an
initial comparison of differences between populations. We show that
the variant data produced by this project improve variant filtering for
the discovery of disease-associated genes of rare diseases. We show that 
Asia has sizable founder populations and that further studies in these 
populations may be useful for the discovery of rare-disease-associated 
genes. We also report an initial survey of loss-of-function alleles found 
in the GAsP project.


The GAsP dataset 


For the GAsP project, we generated 1,267 high-coverage (average 
36x) whole-genome sequences and analysed these together with 596 
publicly available human genome sequences from previous sequenc- 
ing studies (Supplementary Information 1, 2 and Supplementary 
Tables 1a–c, 2a). The 1,739 samples were enriched for individuals
from population isolates to capture the broadest wealth of genetic 
diversity; the dataset includes 598 sequences from India, 156 from 
Malaysia, 152 from South Korea, 113 from Pakistan, 100 from Mon- 
golia, 70 from China, 70 from Papua New Guinea, 68 from Indonesia, 
52 from the Philippines, 35 from Japan and 32 from Russia (Fig. 1a–c
and Supplementary Table 1a–c). To facilitate comprehensive and
comparative analysis of human genetic variation, we included 
sequencing data from African, European and American samples (Sup- 
plementary Table 1a, b). The sequenced samples originate from 7 
global regions, 64 different countries of origin and 219 population 
groups. About 80% of the samples come from Asia and emphasize 
population groups that are underrepresented in previous genetic 
studies (Fig. 1a, b, Supplementary Tables 1a–c, 2b and Supplementary
Information 1, 2). Each global region and population group was
assigned a unique three-letter code for future reference (see Sup- 
plementary Table 1a for three-letter code designations). Within Asia, 
the sampling of many distinct population groups allowed us to ana- 
lyse the relationship between geography, physical characteristics 
and genetic variation. In south and southeast Asia, in particular, we 
sampled across diverse populations to gather new insights into how 
groupings defined on the basis of caste and language relate to genetic 
diversity, admixture with extinct hominins and other genetically 
described characteristics. 


*A list of participants and their affiliations appears in the online version of the paper. 
Knowledge of the complex history of Asian populations informs optimal
sampling for larger-scale biomedical sequencing efforts. We applied 
standard approaches for detecting recent positive selection, quantify- 
ing the population structure and inferring the history of the different 
populations, including principal component analysis’, multiple sequentially
Markovian coalescent (MSMC)”, ADMIXTURE”, FST, uniparental
analyses and the analysis of the Y chromosome and mitochondrial
haplogroups (Fig. 2, Extended Data Fig. 1 and Supplementary Information
3-10). Our results generally recapitulate the broad inferences
of previous studies, and ADMIXTURE plots show complex structure 
within south and southeast Asia (Fig. 2a). In particular, India, Malaysia 
and Indonesia contain multiple ancestral populations as well as multiple 
admixed groups. On the basis of MSMC cross-coalescence rates, which 
reflect the increase in coalescence times of haplotypes sampled from 
different populations relative to haplotypes sampled from the same 
population”, we estimate that the oldest population splits in southeast 
Asia and Oceania involve Melanesians and/or Negritos, who show a 
substructure from approximately 40 thousand years ago and evidence 
of separation around 20-30 thousand years ago (Extended Data Fig. 1b 
and Supplementary Information 3). The population structure provides 
genetic information on classically defined population groups to aid 
future studies. For example, using multiple analytical approaches (Sup- 
plementary Information 3, 6), we confirmed that the anthropologically 
classified ‘Negrito’ groups from India, Malaysia and the Philippines, are 
genetically more closely related to their geographical neighbours than 
they are to other Negrito groups”, suggesting that dark skin colour is 
probably an environmental adaptation (for example, to high levels of 
solar radiation) and not an indicator of shared ancestry. 

Our dense sampling of Asian populations enables the examination 
of Denisovan admixture in greater detail than has been previously 
possible, providing information about population splits or in-flows 
that occurred at or after the time of admixture (Supplementary 
Information 10). Our estimates of Denisovan ancestry were highest 
in Melanesians and the Aeta, intermediate in the Ati and groups from 
the Indonesian island of Flores, and low (but still significantly greater 
than 0) in most south, east and southeast Asian populations. We found
high levels of Denisovan ancestry in Philippine Negrito groups but not 
in Malay or Andaman Negritos; these results are qualitatively similar to 
what was found in a previous study that was based on single-nucleotide
polymorphism (SNP) arrays”’. The high levels of Denisovan ancestry 
in Melanesians and the Aeta are consistent with an admixture event 
into a population that is ancestral to both”’; however, two lines of evi- 
dence suggest that the ancestors of the Aeta experienced a second 


Fig. 1 | Sampling distribution of GAsP.
a, b, Sample sizes. c, Location, language and social
hierarchy associated with samples from south
Asia. Groups with fewer than three samples are not
plotted. See Supplementary Table 1a for
definitions and descriptions of samples and
population groups included in each
geographically defined set.
Denisovan admixture event. First, multiple analyses found that the
Aeta are genetically more similar to populations without appreciable 
Denisovan ancestry (for example, Igorot, Malay and Malay Negrito 
groups) than they are to Melanesians (Supplementary Information 3, 
6). This can be explained by more recent gene flow from other popu- 
lations without Denisovan ancestry. However, such gene flow would 
reduce the levels of Denisovan admixture below that found in Melane- 
sians. More directly, we find that putative Denisovan haplotypes that 
are unique to the Aeta (n = 962) are significantly longer than putative 
Denisovan haplotypes shared between Aeta and Papuans (n = 596, 
mean = 16.1 kb compared with mean = 14.1 kb, Mann-Whitney U-test, 
P«10™°), or putative Denisovan haplotypes unique to Papuans (n=727, 
mean = 16.1 kb compared with mean = 14.9 kb, Mann-Whitney U-test, 
P«107°”) (Supplementary Information 10), supporting a scenario 
in which a second admixture event between the Aeta and Denisovans 
happened after the separation of the Aeta and Melanesians. Two distinct 
Denisovan admixture events are most consistent with Homo sapiens 
and Denisovans interacting within southeast Asia”, making it likely that 
admixture occurred within Sundaland (Fig. 2b) or even farther east”*”. 
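
The haplotype-length comparison above relies on the Mann-Whitney U-test. As a rough illustration of what that test computes, the sketch below counts, over all cross-sample pairs, how often a value from one sample exceeds a value from the other, and then converts the statistic to a two-sided P value with the large-sample normal approximation (no tie or continuity correction). The haplotype lengths are invented for illustration and are not data from the study; the function names are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs with x_i > y_j count 1, ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def normal_approx_p(u, n1, n2):
    """Two-sided P value from the normal approximation (no tie correction)."""
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical haplotype lengths in kb (illustrative only, not study data).
unique_lengths = [18.2, 15.9, 17.4, 16.8, 14.9, 19.1]
shared_lengths = [13.8, 14.6, 12.9, 15.1, 14.2, 13.5]

u_unique = mann_whitney_u(unique_lengths, shared_lengths)
u_shared = mann_whitney_u(shared_lengths, unique_lengths)
p_value = normal_approx_p(u_unique, len(unique_lengths), len(shared_lengths))

if __name__ == "__main__":
    print("U (unique vs shared):", u_unique)
    print("U (shared vs unique):", u_shared)
    print("approximate two-sided P:", p_value)
```

The two U statistics always sum to the number of cross-sample pairs, which is a convenient sanity check. With samples as small as these an exact test would be preferable; the normal approximation is better suited to sample sizes like those in the text (hundreds of haplotypes per group).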

A recent study found a slightly increased amount of Denisovan ancestry
in south Asians compared with a priori expectations”°. We examined
whether this was correlated with either language or social and/or
caste status. South Asian samples were grouped into individuals who 
speak Indo-European languages and individuals who speak non-Indo- 
European languages (excluding individuals who speak Tibeto-Burman 
languages), as well as four social or cultural groups: tribal (Adivasi) 
groups, lower-caste groups, high-caste groups and Pakistani groups 
(Indo-European language speaking only). We found that the average 
levels of Denisovan ancestry were significantly different between the 
four social or cultural groups (Mann-Whitney U-test, P< 10° for all 
pairwise comparisons; Fig. 2c and Supplementary Information 10). Our 
results are consistent with the scenario that Indo-European-speaking 
migrants who entered the subcontinent from the northwest admixed 
with an indigenous South Asian (ancestral south Indian)””* group who 
had higher levels of Denisovan ancestry. 


Medical relevance 


We evaluated the use of the GAsP dataset in disease-associated genetic studies
and medically relevant applications to determine how the results of
larger continuing GenomeAsia studies can be used to improve human 
health (Supplementary Table 4a). We annotated high-quality variants 
using public databases including ExAC (Exome Aggregation Consortium)”’,
gnomAD”, 1000 Genomes Project*, ESP (NHLBI GO Exome
Sequencing Project)*° and dbSNP (Extended Data Fig. 2) and focused 
Fig. 2 | Population structure and admixture.

a, ADMIXTURE plots for k=12 and k=14 illustrating 
the identification of 12 reference groups. 

b, Proposed modern human migration route into 
southeast Asia during the Last Glacial Maximum 
with potential locations of Denisovan admixture 
(yellow asterisks). Green indicates the above water 
landmass at the glacial maximum and white
outlines indicate present-day shorelines. 

c, Estimates of Denisovan ancestry in south Asians, 
stratified by social/cultural group and language. IE, 
Indo-European. Adivasi Indo-European, n=30; 
Adivasi non-Indo-European, n=196; caste Indo- 
European, n= 68; caste non-Indo-European, n=155; 
upper caste Indo-European, n=49; upper caste 
non-Indo-European, n=19; Pakistani Indo- 
European, n=79. The centre line indicates the 
median; box limits show the middle 50%; whiskers 
extend two standard deviations from the mean; 
points are outliers. 
on coding-sequence variants. Overall 23% of protein-altering variants
in GAsP were not found in these data sources. As expected the majority 
of coding variants were singletons or very rare (Extended Data Fig. 2). 
However, the absolute number of novel variants with a minor allele frequency
(MAF) ≥ 0.1% within our pan-Asian dataset is large (n = 194,585),
and these are frequent enough to be of relevance for large-scale genetic 
association studies. We also searched for variants present at low fre- 
quency in the overall dataset that are present at significantly higher 
allele frequencies in one or more of the population groups. We found an 
additional 144,329 novel variants with MAF > 1% in the full GAsP dataset 
that were present at a frequency of greater than 1% within populations 
grouped by geography; South Asia, Southeast Asia, Northeast Asia 
or Oceania (see Supplementary Table 1a for description of samples 
and population groups included in each geographically defined set). 
These geographical regions contain many diverse population groups, 
and additional studies are needed to characterize patterns of genetic 
variation in these groups and disease relevance. 
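
The search for variants that are rare in the pooled dataset but noticeably more frequent within one geographical group can be sketched as a simple two-threshold filter. This is only an illustrative outline, not the pipeline used in the study: the function names, the example allele counts and the default cutoffs (`overall_max`, `regional_min`) are all assumptions and should be replaced with whatever thresholds a given analysis actually uses.

```python
def minor_allele_frequency(alt_alleles, total_alleles):
    """Minor allele frequency from an alternate-allele count and the total alleles typed."""
    freq = alt_alleles / total_alleles
    return min(freq, 1.0 - freq)


def regionally_enriched(counts, overall_max=0.001, regional_min=0.01):
    """Return {variant: [regions]} for variants rare overall but frequent in a region.

    counts maps variant -> region -> (alt_alleles, total_alleles).
    The default cutoffs are illustrative assumptions, not values from the study.
    """
    hits = {}
    for variant, by_region in counts.items():
        alt = sum(a for a, _ in by_region.values())
        total = sum(n for _, n in by_region.values())
        if total == 0 or minor_allele_frequency(alt, total) >= overall_max:
            continue  # untyped, or not rare in the pooled dataset
        regions = [
            region
            for region, (a, n) in by_region.items()
            if n > 0 and minor_allele_frequency(a, n) > regional_min
        ]
        if regions:
            hits[variant] = regions
    return hits


# Hypothetical allele counts (alt alleles, total alleles) per region.
example_counts = {
    "var_1": {"South Asia": (3, 200), "Southeast Asia": (0, 2000), "Northeast Asia": (0, 1400)},
    "var_2": {"South Asia": (50, 200), "Southeast Asia": (400, 2000), "Northeast Asia": (300, 1400)},
    "var_3": {"South Asia": (0, 200), "Southeast Asia": (1, 2000), "Northeast Asia": (0, 1400)},
}

enriched = regionally_enriched(example_counts)

if __name__ == "__main__":
    print(enriched)
```

In this toy input only `var_1` passes both filters: it is below the pooled cutoff but exceeds the regional cutoff in the smallest group, which is the pattern the paragraph describes.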

In rare disease genetics, databases are used to filter based on allele 
frequency with the idea that common alleles are unlikely to be responsible
for rare highly penetrant disorders; however, in the absence of
appropriate population reference datasets, allele frequencies can be 
misclassified and may lead to false disease associations”. We explored 
whether the GAsP variant dataset can improve the ability to identify 
disease-relevant variants in Asian cohorts. We analysed 152 exomes 
from individuals participating in the Indian Maturity Onset Diabetes in 
the Young (MODY) project. When both the gnomAD and GAsP datasets 
were used for filtering (MAF > 0.1%), we reduced the set of remaining 
candidate variants by approximately twofold in comparison to using 
the gnomAD dataset alone (Fig. 3a). In this process, we identified a common 
population polymorphism in NEUROD1 (H241Q) that is probably 
benign but that was previously reported to be medically relevant. We 
annotated variants that were identified in the GAsP dataset against the 
Human Gene Mutation Database (HGMD) disease-causing pathological 
and ClinVar pathogenic variants. This analysis identified 732 variants 
(686 SNPs and 46 insertions or deletions (indels)) in 514 genes (Fig. 3b, 
Supplementary Table 4b, c and Supplementary Information 11). We 
compared the 732 pathogenic variants against the gnomAD, ExAC, 
1000 Genomes, ESP, dbSNP, ALSPAC, TwinsUK and 1000Japanese 
databases to remove variants that occurred at >1%, focused on those 
with allele frequencies >0.15% in GAsP (38 variants), and reviewed them 
against the criteria defined by the American College of Medical Genetics 
(ACMG). This resulted in reclassification of 11 of the 38 variants (Sup- 
plementary Table 4d). We examined the geographical distribution of 
the remaining, revalidated but high-frequency, pathogenic disease- 
associated variants. As expected, most of these variants were highly 
enriched in Asia. For example, an HBB variant (chromosome 11: 5248155 
c.92+5G>C) associated with β-thalassaemia is found almost exclusively 
in south Asians and at a lower frequency in southeast Asians (Fig. 3c). 
We also examined our dataset for novel variants in genes known to be 
associated with cancer risk. We found 13 unique variants in 6 genes from 
17 samples. This included frameshift, stop-gained and essential splice- 
site mutations in BRCA2 (n= 9), BRCA1 (n=1), ATM (n=2), BLM (n=1), 
NBN (n= 2) and PMS2 (n= 2) (Fig. 3d and Supplementary Table 4e). Of 
the two PMS2 essential splice variants, one was found in a Korean sample. 
Loss-of-function mutations in PMS2 are associated with mismatch 
repair defects that lead to a higher risk of cancer development. In a 
separate study of gall bladder cancer, we found the same essential splice 
site PMS2 mutation (chromosome 7:6043690C>G) in a Korean patient 
whose gall bladder cancer exhibits microsatellite instability (E.W.S. 
and S. Seshagiri, manuscript in preparation). Identification of genetic 
variants that affect drug efficacy and safety through the alteration of 
pharmacokinetics enables application of individualized treatment. 
Variation in drug responses is generally recognized, and recommendations 
for dosing are sometimes guided by apparent or self-reported 
population identity despite the lack of a rigorous pharmacogenomic 
basis. We assessed the allele frequencies of key pharmacogenomic 
variants in our dataset to identify inter-population differences that 
have potential implications on drug testing and treatment (Fig. 3e, 
Supplementary Table 4g and Supplementary Information 13). 
Carbamazepine, clopidogrel, peginterferon and warfarin showed 
the largest variation between populations in predicted adverse drug 
responses with groups ranging from 0 to 100% predicted adverse drug 

Fig. 3 | Disease-relevant variant discovery. a, Filtering using the GAsP dataset 
improves candidate variant discovery by removing population-specific 
variants (n=152). The centre line indicates the median; box limits show the 
upper and lower quartiles; whiskers extend 1.5x the interquartile range. 

b, Allele count (AC) and frequency distribution of variants in the GAsP dataset 
that are designated disease-causing in the Human Gene Mutation Database 
(HGMD) or pathogenic in ClinVar. Autosomal-dominant (AD) or autosomal- 
recessive (AR) or other (unknown) classification as per OMIM. A number of 
variants (n =37) that had previously been reported to be pathogenic are found 


responses. For example, the HLA-B*15:02 variant, associated with risk 
for development of Stevens-Johnson syndrome in patients treated 
with carbamazepine, was found to occur at an increased frequency in 
Austronesian language-speaking populations from southeast Asia (for 
example, 63% in the Mentawai of West Sumatra; 46.6% in the Nias of 
North Sumatra) compared with other groups (Supplementary Infor- 
mation 13). There are roughly 400 million individuals who belong 
to Austronesian groups that are at increased risk for carbamazepine 
sensitivity, including the vast majority of the people from Indonesia, 
Malaysia and the Philippines. 


Founder populations 

Population bottlenecks produce strong founder effects and increased 
rates of recessive disease. In populations with strong founder effects, 
the loss-of-function variant frequency spectrum is skewed higher, 
greatly increasing power of association and providing unique advantages 
for the identification of genes associated with both rare and complex 
diseases. We followed the approach described in a previous 
study on south Asian populations to characterize the degree to which 
genomic segments are inherited as identical by descent (IBD) in popula- 
tion groups in our dataset. 


in the GAsP study dataset at high frequency and were reclassified 
(Supplementary Table 4d). c, Frequency of β-thalassaemia variant 
(chromosome 11:5248155 c.92+5G>C) across Asia shows a geographical 
enrichment. MAF in South Asia is 1.4%. NA, not available. d, Novel cancer- 
predisposing variants identified in the GenomeAsia dataset. e, Population- 
specific probabilities of adverse drug reactions predicted from the aggregate 
allele frequencies of known variants associated with response to the indicated 
drugs. 


Our analysis revealed IBD scores of 1.465 and 0.817 for Finnish and 
British groups, consistent with previous analyses. The IBD score of all 
of the groups was normalized relative to the Finnish group (Fig. 4a and 
Supplementary Information 12). Our study includes many groups with 
small population sizes and it is expected that endogamy paired with small 
population size will greatly increase IBD scores. We found that indigenous 
and tribal groups had IBD scores that were skewed upwards from non-tribal 
groups (Fig. 4b). Notably, we found that a number of Asian groups 
with large urban populations have IBD scores above or close to that of the 
Finnish population. For example, samples from an outpatient hospital in 
Chennai, a city with a census size of 9 million, had an IBD score that was 
approximately 1.3 times greater than the score for the Finnish group. 
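
Normalizing IBD scores to the Finnish group amounts to dividing each group's raw score by the Finnish raw score. A minimal sketch using the two raw scores reported above; the function name and dictionary layout are illustrative assumptions rather than part of the published analysis.

```python
def normalize_ibd(raw_scores, reference="Finnish"):
    """Express each group's IBD score as a ratio to the reference group."""
    ref = raw_scores[reference]
    if ref <= 0:
        raise ValueError("reference IBD score must be positive")
    return {group: score / ref for group, score in raw_scores.items()}


# Raw scores quoted in the text for the two European comparison groups.
raw = {"Finnish": 1.465, "British": 0.817}
relative = normalize_ibd(raw)
for group, ratio in relative.items():
    print(f"{group}: {ratio:.3f}")
```

On this scale the Finnish group is 1 by construction, and a group such as the Chennai sample described above would sit at roughly 1.3.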


Human knockouts 

Homozygous loss-of-function alleles found in humans give us the 
opportunity to assess the phenotypic effect of specific gene loss and 
can provide important information about opportunities for treating 
disease. To assess the contents of our dataset, we examined high-confidence 
protein-truncating variants (PTVs). We found 17,566 PTVs 
with at least 1 PTV in approximately 43% of all protein-coding genes 
(n=8,766; Fig. 4c). Among the PTVs, most were heterozygous variants 

Fig. 4 | Founder effects and homozygous loss of function. a, IBD scores across 
different population groups are shown for 96 ethnicities (1,417 samples) across 
global regions. The scores given in the figure are relative ratios compared to 
that of the Finnish group. b, Violin plot showing IBD scores in 29 tribal groups 
and 25 non-tribal groups consisting of 293 and 336 samples, respectively. The 
centre line indicates the median; box limits show 1.5x the interquartile range. 


unique to our dataset (n= 8,799; Fig. 4d), similar to the PTV data from 
ExAC (67% singletons). A smaller number were homozygous and had 
been reported in gnomAD, dbSNP or 1000 Genomes Project (n= 856). 
In addition, within our dataset were 121 homozygous PTVs that have 
not previously been reported (Supplementary Table 5). These novel 
homozygous PTVs were mostly found in groups with high IBD scores 
such as the Jarawa and Onge from the Andaman Islands (Fig. 4e). The 
novel homozygous PTVs include an allele of the ABCA7 gene, Q2010*, 
that is found in only the Aeta population (Fig. 4f). Heterozygosity for 
loss-of-function alleles of ABCA7 has been shown to increase susceptibility 
to Alzheimer’s disease in European populations. 


Imputation panel 

We carried out preliminary work to evaluate the utility of the pilot dataset 
for imputation. For this analysis, we downsampled 30x whole-genome sequence 
data from South Asian, Southeast Asian and Northeast Asian population 
groups (see Supplementary Table 1a for samples included in each of these 
geographically defined sets) to the genotypes represented on the 
Illumina Global Screening Array v.1 genotyping array, and compared the 
imputation using either phase 3 of the 1000 Genomes Project or the GAsP 
reference panels. We found, as described by Illumina, that imputation 
accuracy of the 1000 Genomes Project reference panel is consistently well 
below 90% for east Asian and south Asian samples whereas using the GAsP 
reference panel we achieved accuracies ranging from 93 to 95%. To accelerate 
evaluation and broad utility, we have placed the data on the Michigan 
Imputation Server (https://imputationserver.sph.umich.edu/index.html). 


Discussion 

Understanding the genetic basis of human disease will benefit from an 
increase in the number and scale of disease-association studies that are 
carried out in Asian populations. In the pilot phase of the GenomeAsia 
project, the sample set that we analysed allowed us to address a wide 
range of questions regarding the history of specific Asian population 

c, Proportion of genes with at least one high-confidence PTV. d, Proportion 
of novel, known, heterozygous and homozygous PTVs in the GAsP dataset. 
e, Pie chart of novel homozygous PTVs plotted by region (inner circle) and 
population group (outer circle). Groups with fewer than two PTVs were grouped 
as other. f, Novel homozygous PTV Q2010* (green) found in ABCA7 localizes to 
the C-terminal ABC domain. Previously reported PTVs are shown in grey. 


groups and to map out strategies for additional sequencing efforts. 
We plan for a staged and coordinated approach, to include the generation 
of genomic population-specific reference datasets and imputation 
panels, and use this approach for the production of custom SNP 
arrays as a catalyst for disease-association studies. This approach is 
particularly useful in founder populations, as shown by recent studies in 
the founder populations of Finland, as well as other populations. This 
will be particularly valuable in Asia, which has founder effects that 
have not only previously been demonstrated in isolated populations, 
but are also evident in major urban centres. 

Analysis of the GAsP dataset allows us to map out strategies for efforts 
focused on specific population centres in Asia as well as the generation 
of important tools that will increase our understanding of how genetic 
variants affect disease susceptibility and drug responses. The dataset 
improves the ability to filter out low-probability candidates for highly 
penetrant disorders, to identify putatively pathogenic variants that 
are found at high frequency in particular populations and to improve 
the ability to infer pathogenicity of identified variants. The identification 
of novel homozygous PTVs in this study expands the catalogue of 
genes in which homozygous loss of function appears to be tolerated 
and, when combined with phenotype information, this will provide 
important biological insights into gene function. The ability to define 
gene function in humans through the study of the phenotypic effects 
of loss-of-function mutations is becoming an increasingly valuable 
approach, and the study of additional variants and populations in 
which homozygosity occurs at high rates will add to the global resources 
for carrying out human knockout studies. 

Methods 


Data reporting 

No statistical methods were used to predetermine sample size. The 
experiments were not randomized and the investigators were not 
blinded to the allocation during analysis. 


Samples 

We accessed publicly available high-coverage, whole-genome FASTQ 
files from previous studies of human genetic variation and combined 
these with 1,267 high-coverage genomes generated as part of 
this project. Full details on the samples chosen for sequencing and 
the informed consent processes for these samples can be found in 
Supplementary Information 1. We restricted our analyses to genomes 
generated using Illumina short-read sequencing technology. 


Whole-genome sequencing 

Whole-genome sequencing libraries were prepared using standard 
protocols (Illumina) and sequenced on Illumina HiSeq 2500/4000 or 
X10 machines. We obtained paired-end reads (2 × 100 bp or 2 × 150 bp) for 
each sample. 


Filtering, alignment and variant calling 
We aligned the Illumina short-read sequences to the GRCh37+decoy reference 
genome with BWA-mem using the default parameters. Putative 
PCR duplicates were flagged using SAMBLASTER. The SAM outputs 
were converted to BAM format, and sorted by chromosomal coordinates 
using Sambamba, and all BAM files for the same samples were merged. 
The sex of the samples was inferred from the coverage of the autosomes 
and the sex chromosomes, and checked against the metadata submitted 
with the samples. All samples that had an average coverage 
less than 20-fold or for which we found a difference in the inferred and 
reported sex were removed from further analysis. We used verifyBamID 
to identify contamination using the chip-free mode and samples for 
which swaps or contamination were identified were removed from subsequent 
analyses. A contamination level of 3% was used as a cut-off, and this 
left us with 1,739 samples that were used for all downstream analyses. 
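
The three sample-level exclusion criteria above (mean coverage, sex concordance and contamination) can be expressed as a single predicate. The sketch below is illustrative only: the `SampleQC` record and its field names are hypothetical, and treating a contamination estimate of exactly 3% as passing is an assumption, because the text gives the cut-off but not the boundary behaviour.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical per-sample QC record (field names are illustrative)."""
    sample_id: str
    mean_coverage: float   # average autosomal depth, in fold coverage
    inferred_sex: str      # inferred from autosome vs sex-chromosome coverage
    reported_sex: str      # taken from submitted metadata
    contamination: float   # estimated contamination fraction, e.g. 0.01 = 1%


def passes_qc(sample: SampleQC,
              min_coverage: float = 20.0,
              max_contamination: float = 0.03) -> bool:
    """Apply the coverage, sex-concordance and contamination filters."""
    if sample.mean_coverage < min_coverage:
        return False
    if sample.inferred_sex != sample.reported_sex:
        return False
    # Assumption: values exactly at the 3% cut-off are retained.
    if sample.contamination > max_contamination:
        return False
    return True


# Toy records, invented for illustration.
samples = [
    SampleQC("s1", 32.0, "F", "F", 0.01),
    SampleQC("s2", 18.5, "M", "M", 0.00),   # fails coverage
    SampleQC("s3", 35.0, "M", "F", 0.00),   # fails sex concordance
    SampleQC("s4", 30.0, "F", "F", 0.05),   # fails contamination
]
kept = [s.sample_id for s in samples if passes_qc(s)]
print(kept)
```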
We identified the single-nucleotide substitutions and small indel 
variants in the 1,739 samples using the reference model (gVCF-based) 
workflow for joint analysis in GATK. Variants were called individually 
in each sample using the HaplotypeCaller in ‘-ERC GVCF’ mode to 
produce a record of genotype likelihoods and annotations at each site 
in the genome. Multi-allelic variants are reported in the GenomeAsia 
browser but were not included in the analysis. A gVCF file was created 
for every sample, and a subsequent joint genotyping analysis of all 
gVCFs was done to identify the variants in the cohort. We followed the 
GATK-recommended best practices for variant recalibration to create 
a final VCF file and recalibrated the variants to select 99% of the true 
sites from the training set for VQSR. The VCF files were zipped using 
bgzip and indexed using tabix. 


Identification of first-degree relative pairs 

Several of the reported analyses require filtering to remove related 
samples. We used KING to identify such first-degree relative pairs. We 
first used vcftools and plink to convert the VCF file into the required 
input format for KING. The estimated kinship coefficient was restricted 
to 0.177-0.354 as described in the KING manual to identify the first-degree 
relative pairs, and the results were confirmed from the submitted 
metadata. The number of unrelated samples by country-of-origin 
is shown in Supplementary Table 1.1. 
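
As an illustration of the relatedness filter, the sketch below classifies pairs by the kinship range quoted above and then builds an unrelated subset. The range 0.177-0.354 comes from the text; the greedy rule for choosing which member of a pair to drop, and all function names, are assumptions made for the example and are not described in the source.

```python
FIRST_DEGREE_RANGE = (0.177, 0.354)  # kinship range quoted in the text


def is_first_degree(kinship: float) -> bool:
    """True if the estimated kinship coefficient falls in the first-degree range."""
    low, high = FIRST_DEGREE_RANGE
    return low <= kinship <= high


def unrelated_subset(samples, pairs):
    """Greedy sketch: for each first-degree pair whose members are both
    still retained, drop the second member. `pairs` is an iterable of
    (sample_a, sample_b, kinship) tuples."""
    kept = set(samples)
    for a, b, kinship in pairs:
        if is_first_degree(kinship) and a in kept and b in kept:
            kept.discard(b)
    return [s for s in samples if s in kept]


# Toy kinship estimates, invented for illustration.
pairs = [("A", "B", 0.25), ("C", "D", 0.05), ("B", "C", 0.30)]
print(unrelated_subset(["A", "B", "C", "D"], pairs))
```

Here `B` is dropped because of its relationship with `A`, after which the `B`-`C` pair no longer requires any action.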


Quantifying population structure and changes in population size 
We restricted our attention to 7,966,132 autosomal markers (that is, 
SNPs) with MAF > 0.01 and call rate > 98%. In some analyses, severe linkage 
disequilibrium pruning was applied as follows: sliding windows of 
size 50 (that is, the number of markers used for linkage disequilibrium 
testing at a time) and window increments of 5 markers; for any pair of 
SNPs in a window, the first marker of the pair was discarded if r² > 0.2. 
After linkage disequilibrium pruning, 1,089,227 SNPs were retained for 
analysis. All data-filtering procedures were conducted in PLINK v.1.9. 
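
The pruning rule described above (windows of 50 markers, steps of 5, discard the first marker of any pair with r² > 0.2) can be sketched in pure Python. This is a simplified illustration of the stated rule and not PLINK's implementation; the function names and the genotype layout (one list of 0/1/2 dosages per marker, in genomic order) are assumptions. Settings of this form correspond to PLINK's `--indep-pairwise 50 5 0.2` option, although the source does not quote the exact command used.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker: correlation undefined, treat as 0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def ld_prune(genotypes, window=50, step=5, threshold=0.2):
    """Return indices of markers retained after sliding-window pruning."""
    removed = set()
    n_markers = len(genotypes)
    start = 0
    while start < n_markers:
        in_window = [i for i in range(start, min(start + window, n_markers))
                     if i not in removed]
        for k, i in enumerate(in_window):
            for j in in_window[k + 1:]:
                if r_squared(genotypes[i], genotypes[j]) > threshold:
                    removed.add(i)  # discard the first marker of the pair
                    break
        start += step
    return [i for i in range(n_markers) if i not in removed]


# Toy genotypes: markers 0 and 1 are identical; marker 2 is uncorrelated.
toy = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [0, 0, 1, 2, 2, 1],
]
print(ld_prune(toy))
```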

Analysis of population structure was performed using the quality-control-positive, 
linkage-disequilibrium-pruned set of 1,089,227 autosomal 
SNPs. Principal component analysis (PCA) was conducted across 
all available populations in EIGENSTRAT v.6.1.4. Results were visualized 
in Tableau v.9.3. We applied unsupervised hierarchical clustering of 
individuals using the maximum likelihood method implemented in 
ADMIXTURE v.1.3.0 using default input parameters. The ‘--cv’ flag was 
adopted to perform the cross-validation procedure and to calculate 
the optimal k value. 

We used MSMC to estimate changes in population size and split times. 
This analysis used two different phased genome datasets (using Shapeit v.2 
and Eagle2). The details for the phasing are described in Supplementary 
Information 4. Chromosome 6 was excluded from the analysis 
owing to possible phasing errors in the HLA region. We used four haplotypes 
(two individual genomes) for estimating changes in population 
size in a population and eight haplotypes (two genomes from each of 
a pair of populations) for the estimation of population split times. We 
assumed a mutation rate of μ = 1.25 × 10⁻⁸ per site per generation and an 
average generation time of 29 years, as in previous studies. 
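
To show how the assumed mutation rate and generation time enter the estimates, the sketch below converts mutation-scaled MSMC output into years and effective population size. It assumes the scaling convention commonly used for MSMC output (time divided by μ gives generations; the inverse coalescence rate divided by 2μ gives effective size) and a mutation rate of 1.25 × 10⁻⁸ per site per generation; the function names and the example inputs are hypothetical.

```python
MUTATION_RATE = 1.25e-8    # per site per generation (assumed value)
GENERATION_YEARS = 29.0    # average generation time from the text


def scaled_time_to_years(scaled_time,
                         mu=MUTATION_RATE,
                         generation_years=GENERATION_YEARS):
    """Convert a mutation-scaled time into calendar years."""
    generations = scaled_time / mu
    return generations * generation_years


def coalescence_rate_to_ne(scaled_rate, mu=MUTATION_RATE):
    """Convert a mutation-scaled coalescence rate into effective population size."""
    return (1.0 / scaled_rate) / (2.0 * mu)


# Hypothetical values chosen only to make the arithmetic easy to follow.
years = scaled_time_to_years(1e-5)        # 800 generations of 29 years each
ne = coalescence_rate_to_ne(4000.0)
print(f"{years:.0f} years, Ne = {ne:.0f}")
```

Because both conversions are linear in 1/μ, halving the assumed mutation rate doubles every date and every population size, which is why the assumed values are reported alongside the results.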


Comparison with 1000 Genomes Project genotype calls 

We filtered the variant calls to include only biallelic SNPs with <10% 
missing genotype calls that were within the 1000 Genomes Project 
strict mask (available at ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/accessible_genome_masks/20141020.strict_mask.whole_genome.bed). 
Then, for each of the 119 overlapping 
samples considered individually, we calculated variant discordance 
rates for those filtered SNPs that (1) had a genotype call in both the 
1000 Genomes Project data and the GAsP data; and (2) had a ‘variant’ 
call (that is, a non-homozygous reference genotype call) in at least one 
of the datasets. These discordance rates were then stratified by the 
estimated MAF in the GAsP dataset. 
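
The per-sample discordance calculation can be sketched as follows. Genotypes are encoded here as alternate-allele counts (0, 1 or 2) with `None` for a missing call; this encoding, the dictionary layout and the function name are assumptions made for illustration. Only sites called in both datasets and carrying a non-reference call in at least one of them contribute, as described above.

```python
from typing import Dict, Optional


def variant_discordance(calls_a: Dict[str, Optional[int]],
                        calls_b: Dict[str, Optional[int]]) -> Optional[float]:
    """Fraction of comparable variant sites with differing genotypes.

    Returns None when no site qualifies for comparison."""
    compared = 0
    discordant = 0
    for site in calls_a.keys() & calls_b.keys():
        ga, gb = calls_a[site], calls_b[site]
        if ga is None or gb is None:
            continue  # condition (1): genotype call required in both datasets
        if ga == 0 and gb == 0:
            continue  # condition (2): variant call required in at least one
        compared += 1
        if ga != gb:
            discordant += 1
    return discordant / compared if compared else None


# Toy calls for one sample, invented for illustration.
kgp = {"s1": 0, "s2": 1, "s3": 2, "s4": None, "s5": 1}
gasp = {"s1": 0, "s2": 1, "s3": 1, "s4": 1, "s5": 0}
print(variant_discordance(kgp, gasp))
```

In the toy data three sites are comparable and two of them differ, giving a discordance of 2/3. Stratifying by MAF would amount to running the same calculation on subsets of sites binned by frequency.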


Patterns of allele sharing 

We used a parsimony-based analysis of allele sharing that focused 
on SNPs that were not present in sub-Saharan Africans or in archaic 
humans (further details are provided in Supplementary Information 8). 


Archaic admixture 

We used a method similar to the ‘enhanced’ D-statistic approach to 
estimate levels of Neanderthal and Denisovan ancestry in each non-African 
sample. The estimates were calibrated assuming 0% Denisovan 
ancestry in the British population, 4% Denisovan ancestry in the Papuan 
population and 2% Neanderthal ancestry in the British population (full 
details are provided in Supplementary Information 9). 


Determination of high-quality variants for medically related 
analyses 

High-quality variants were defined as variants that (1) had a read depth 
≥ 5 and genotype quality > 20; (2) were contained in the high-confidence 
regions as described by Genome in a Bottle 
(ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/NISTv3.3.2/GRCh37/supplementaryFiles/HG001_GRCh37_GIAB_highconf_CG-IllFB-IllGATKHC-Ion-10X-SOLID_CHROM1-X_v.3.3.2_highconf.bed); 
and (3) passed the gnomAD Filter. Variant annotation was carried out 
using SnpEff (v.4.1). 
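
The three conditions that define a high-quality variant can be combined into one check. The sketch below is a hedged illustration: the function names, argument order and region layout are hypothetical, the depth threshold is treated as inclusive (an assumption), and the region lookup assumes BED-style 0-based half-open intervals that are sorted and non-overlapping within each chromosome.

```python
from bisect import bisect_right


def in_confident_region(chrom, pos, regions):
    """True if a 1-based position falls inside a high-confidence interval.

    `regions` maps chromosome -> sorted list of (start, end) tuples in
    0-based half-open BED coordinates."""
    intervals = regions.get(chrom, [])
    starts = [start for start, _ in intervals]
    idx = bisect_right(starts, pos - 1) - 1
    return idx >= 0 and pos - 1 < intervals[idx][1]


def is_high_quality(depth, genotype_quality, chrom, pos,
                    passed_gnomad_filter, regions,
                    min_depth=5, min_gq=20):
    """Combine the depth, genotype-quality, region and filter conditions."""
    return (depth >= min_depth              # assumption: threshold is inclusive
            and genotype_quality > min_gq   # strictly greater than 20
            and passed_gnomad_filter
            and in_confident_region(chrom, pos, regions))


# Toy high-confidence intervals, invented for illustration.
regions = {"1": [(100, 200), (500, 600)]}
print(is_high_quality(10, 30, "1", 150, True, regions))
print(is_high_quality(10, 30, "1", 201, True, regions))
```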


IBD scores 


Groups with at least two samples were considered for analysis. We 
restricted our analysis to genomic regions with high-confidence calls 
and removed related samples based on reported relationship, kinship, 
PCA and IBD analyses. The scores given in the figure are relative ratios 
compared to that of the Finnish group. 


PTVs 


PTVs are defined as high-quality variants that were annotated as having 
a strong impact on the protein (such as frameshifts, essential splice 
sites or premature stop codons). We restricted calls to high-confidence 
regions determined by Genome in a Bottle as described above and 
filtered for high-confidence PTVs using the LOFTEE program. We 
used a similar strategy for additional filtering of variants as proposed 
previously and flagged variants that had <7 reads covering the variant site, 
for which <80% of reads carried the variant, that were not in the bottom 
1 percentile of phyloP or gerpRS scores and for which the affected transcripts 
made up less than 50% of all expression as specified by GTEx. 


Enriched medically relevant variants 

We compared variant allele counts for Asian and Oceania samples 
from the GenomeAsia cohort to allele counts present in non-Asian 
gnomAD samples (European (non-Finnish), European (Finnish), Latino, 
African or other) for variants found in a set of 124 medically relevant 
genes. The genes used were 115 genes used for prenatal screening 
as well as the cancer-associated genes BRCA1, BRCA2, TP53, MEN1, 
MLH1, MSH2, MSH6, PMS1 and PMS2A. A Fisher’s exact test was used 
to identify variants that were significantly overrepresented in the 
GenomeAsia subsamples, with correction for multiple testing using the 
Bonferroni method. We further assessed variants for these genes that 
had not previously been reported. All variants were further filtered as 
being damaging as determined by having a high impact on the protein 
(stop codon, essential splice site or frameshift mutation) or being 
predicted to be damaging by the PolyPhen-2 program. A cumulative 
comparison of allele counts for all over-represented and novel variants 
was performed and compared to non-Asian gnomAD to calculate a P 
value, odds ratio and relative difference in cumulative allele frequency 
(GenomeAsia cumulative allele frequency minus gnomAD non-Asian 
allele frequency). Reported P values were corrected for multiple testing 
using the Bonferroni method. 
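
The enrichment test can be illustrated with a small, dependency-free implementation of Fisher's exact test on a 2 × 2 table of allele counts, followed by Bonferroni correction. This is a sketch under assumptions: the text does not state whether a one- or two-sided test was used, so a one-sided test for over-representation is shown; the function names and the toy table are hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact P value for enrichment in the first row.

    Table layout (allele counts):
                        alt   ref
        cohort of interest  a     b
        comparison cohort   c     d
    """
    n_total = a + b + c + d
    n_alt = a + c
    n_row1 = a + b
    upper = min(n_alt, n_row1)
    # Hypergeometric tail: probability of observing >= a alt alleles in row 1.
    numerator = sum(comb(n_alt, x) * comb(n_total - n_alt, n_row1 - x)
                    for x in range(a, upper + 1))
    return float(Fraction(numerator, comb(n_total, n_row1)))


def odds_ratio(a, b, c, d):
    """Sample odds ratio; infinite when the denominator is zero."""
    return (a * d) / (b * c) if b * c else float("inf")


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Toy table, invented for illustration.
p = fisher_exact_greater(3, 1, 1, 3)
print(p, odds_ratio(3, 1, 1, 3), bonferroni(p, 10))
```

For the toy table the one-sided P value is 17/70, and the Bonferroni-adjusted value is capped at 1 once it is multiplied by ten tests.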


Reporting summary 


Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


For each variant, summary data for genotype quality, allele depth and 
population-specific allele counts were calculated before removing all 
genotype data. This dataset is available without requirement for login 
or other form of restriction for browsing or for download at 
https://browser.genomeasia100k.org. Individual-level VCF data files representing 
the 1,180 newly sequenced genomes from individuals of 74 population 
groups are freely available to any qualified investigator without 
restriction. Chinese samples sequenced were from Coriell cell lines 
and are not subject to Chinese government regulation. The data are 
also available from the European Genome-phenome Archive (EGA) under 
accession number EGAS00001002921. The procedure for accessing individual-level 
data is as follows: access forms can be obtained from the 
GenomeAsia website (https://browser.genomeasia100k.org), and once 
filled out and sent to dataaccess@genomeasia100k.org the request 
will undergo administrative review and instructions for downloading 
the data will be returned to the requestor. Access to individual-level 
data from Malaysian samples is subject to additional restrictions. 
The complete dataset of sequences of unrelated individuals (1,667 
samples) has been phased and can be used for imputation through 
the Michigan Imputation Server (https://imputationserver.sph.umich.edu/index.html). 
The goal of the GenomeAsia100K consortium is to 
facilitate and accelerate genetic studies in Asian populations by coordinating 
sequencing efforts among its members. To achieve this goal, 
we are committed to continuing to make data publicly available and 
accessible. As data are contributed to the consortium by individual 
members, they will be made immediately available in summary form or 
as imputation reference panels where appropriate. Data will be made 
available in individual form wherever possible and not limited by the 
bounds of informed consent, national privacy laws and regulations, 
or other external restrictions that may apply. 

Extended Data Fig. 1 | Diversity and divergence times of GAsP samples. 
a, PCA plot of study samples. Africa (AFR), n=102; West Eurasia (WER), n=111; 
South Asia (SAS), n=642; Southeast Asia (SEA), n=162; Oceania (OCE), n=68; 
Northeast Asia (NEA), n=346; Americas (AMR), n=26. The samples included in 
each of these geographically defined groups are described in Supplementary 
Table 1a. b, MSMC cross-coalescence rates showing divergence time estimates 
between different groups. The point estimates give the dates at which 
25%, 50% and 75% of lineages in the pair of populations have coalesced into a 
common ancestral population. 

Panel b, divergence time estimates (25%/50%/75%): 
YRI-BAI (41/71/120 kya) 
YRI-GBR (40/53/91 kya) 
GBR-BAI (25/34/58 kya) 
BAI-ONG (24/32/42 kya) 
ONG-AET (18/24/32 kya) 
GBR-PNY (18/24/33 kya) 
KEN-IGO (17/17/23 kya) 
AET-KEN (12/17/24 kya) 
PNY-TOD (9/18/24 kya) 
KOR-IGO (12/16/22 kya) 
KOR-BUR (8/12/16 kya) 

Extended Data Fig. 2 | Characteristics of GAsP SNPs and indels. 
a, b, Comparison of all GAsP variants (a) or coding variants (b) with gnomAD, 
ExAC, 1000 Genomes, ESP and dbSNP data as a function of the MAF within the 
GAsP dataset. c, d, The number and lengths of small indels in the genome (c) or 
coding regions (d). e-h, Proportion of non-coding (e, g) or coding (f, h) indels 
that were singletons (e, f) or rare (allele frequency of <0.1%; g, h). 


Data collection: no software was used. 

Data analysis: 
BWA version 0.7.13 (https://github.com/lh3/bwa); 
SAMBLASTER version 0.1.22 (https://github.com/GregoryFaust/samblaster); 
Sambamba version 0.6.1 (https://github.com/lomereiter/sambamba); 
BAMreport version 0.0.2 (https://github.com/aakrosh/BAMreport); 
verifyBamID version 1.1.3 (http://genome.sph.umich.edu/wiki/VerifyBamID); 
GATK version 3.5 (https://software.broadinstitute.org/gatk/); 
vcfanno version 0.1.0-dev (https://github.com/brentp/vcfanno); 
htslib version 1.3.1-64-g74bcfd7 (https://github.com/samtools/htslib); 
vcftools version 0.1.14 (https://vcftools.github.io/index.html); 
plink version 1.90b3.40 (http://zzz.bwh.harvard.edu/plink/); 
king version 1.4 (http://people.virginia.edu/~wc9c/KING/); 
rtg-tools version 3.7 (https://github.com/RealTimeGenomics/rtg-tools); 
Shapeit v2 (Delaneau et al. 2012); 
extractPIRs (Delaneau et al. 2013); 
Eagle2 algorithm (Loh et al. 2016), version 2.3; 
generate_multihetsep.py, downloaded from https://github.com/stschiff/msmc-tools; 
Admixture v.1.3.0 (Alexander et al. 2009); 
EIGENSTRAT v.6.1.4 (Price et al. 2006); 
Selscan v.1.1.0 (Szpiech and Hernandez 2014); 
BEAST v.1.8.4 (Drummond et al. 2012); 
PLINK v1.9 
The evolutionary processes that drive universal therapeutic resistance in adult 
patients with diffuse glioma remain unclear. Here we analysed temporally separated 
DNA-sequencing data and matched clinical annotation from 222 adult patients with 
glioma. By analysing mutations and copy numbers across the three major subtypes of 
diffuse glioma, we found that driver genes detected at the initial stage of disease were 
retained at recurrence, whereas there was little evidence of recurrence-specific gene 
alterations. Treatment with alkylating agents resulted in a hypermutator phenotype 
at different rates across the glioma subtypes, and hypermutation was not associated 
with differences in overall survival. Acquired aneuploidy was frequently detected in 
recurrent gliomas and was characterized by IDH mutation but without co-deletion of 
chromosome arms 1p/19q, and further converged with acquired alterations in the cell 
cycle and poor outcomes. The clonal architecture of each tumour remained similar 
over time, but the presence of subclonal selection was associated with decreased 
survival. Finally, there were no differences in the levels of immunoediting between 
initial and recurrent gliomas. Collectively, our results suggest that the strongest 
selective pressures occur during early glioma development and that current therapies 
shape this evolution in a largely stochastic manner. 


Diffuse glioma is the most common malignant brain tumour in adults 
and invariably relapses despite treatment with surgery, radiotherapy 
and chemotherapy. The molecular landscape of glioma at diagnosis has 
been extensively characterized. Although these efforts have led to 
the identification of driver genes and clinically relevant subtypes, 
how the glioma genetic landscape evolves over time and in response 
to therapy is unknown. 


Intratumoral heterogeneity is a well-recognized characteristic of 
gliomas and results from selective pressures such as a limited 
availability of nutrients, clonal competition and treatment. Tumours 
are thought to circumvent these growth bottlenecks by dynamic 
competition of subclones that result in the most favourable 
environment for tumour sustenance. Recent studies have suggested that 
stochastic changes in clone frequency (that is, neutral evolution) and 
immune surveillance may further contribute to the observed 
intratumoral heterogeneity. An understanding of evolutionary dynamics 
at several time points is needed to develop strategies aimed at 
delaying or preventing the onset of tumour progression. 

A list of affiliations appears at the end of the paper. 

To investigate clonal dynamics over time and in response to 
therapeutic pressures, we established the Glioma Longitudinal Analysis 
(GLASS) Consortium. GLASS is a community-driven effort that seeks 
to overcome the logistical challenges in constructing adequately 
powered longitudinal genomic glioma datasets by pooling datasets 
from patients treated at institutions worldwide. We have analysed 
longitudinal profiles across the three molecular glioma subtypes to 
identify the molecular processes active at initial and recurrent time 
points. These analyses identified few common features of glioma 
evolution across subtypes, and instead pointed towards highly variable 
and patient-specific trajectories of genomic alterations. 


GLASS cohort 


We pooled existing and newly generated longitudinal DNA sequencing 
datasets from 288 patients treated at 35 hospitals (Supplementary 
Table 1, Extended Data Fig. 1). After applying quality filters, tumour 
samples from 222 patients with high-quality data in at least two time 
points were classified according to molecular markers into three major 
glioma subtypes: (1) IDH-mutant and chromosome 1p/19q co-deleted 
(hereafter referred to as IDH-mutant-codel; n = 25); (2) IDH-mutant 
without co-deletion of chromosome 1p/19q (hereafter IDH-mutant- 
noncodel; n = 63); and (3) IDH-wild-type (n = 134), in alignment with 
the World Health Organization (WHO) classification of tumours of 
the central nervous system. For each patient, we selected two 
time-separated tumour samples, henceforth termed initial and recurrence, 
for further analysis. 


Mutational burdens and processes over time 


We first evaluated temporal changes in mutational burden and pro- 
cesses to understand general patterns of glioma evolution. Mutation 
burdens in initial tumours were comparable with previously reported 
rates. There were 2.20 mutations (single-nucleotide variants and 
small insertions or deletions) per megabase (Mb) for IDH-mutant- 
codels; 2.52 mutations per Mb for IDH-mutant-noncodels; and 2.85 
mutations per Mb for IDH-wild-type glioma (Fig. 1a, Extended Data 
Fig. 2a). Excluding DNA hypermutation cases (more than 10 muta- 
tions per Mb, n=35), the mutation burden increased after recurrence 
in 70% of the cohort (Extended Data Fig. 2a). To study changes during 
tumour progression, we separated mutations into three fractions: 
initial only, recurrence only, or shared. Notably, the mutation burdens 
of the private fractions, but not the shared fraction, were comparable 
between subtypes (Extended Data Fig. 2b). Patient age at diagnosis 
was significantly associated with the shared mutational burden (P= 
1.7 x10’), and to a lesser extent with the burden of mutations private 
to the initial tumour (P= 0.0256) (Extended Data Fig. 2c). On average, 
a longer time to recurrence was associated with a larger increase in 
mutation burden (P= 0.0043, Extended Data Fig. 2d). 
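
The fraction assignment and burden calculation described above can be sketched as follows. This is a minimal illustration, assuming each mutation is reduced to a hashable key and that the covered sequence territory in megabases is known. The function names, the example variants and the 30 Mb territory are hypothetical; only the hypermutation threshold of 10 mutations per Mb comes from the text.

```python
from typing import Dict, Hashable, Set

HYPERMUTATION_THRESHOLD = 10.0  # mutations per Mb, as stated in the text


def split_fractions(initial: Set[Hashable], recurrence: Set[Hashable]) -> Dict[str, Set[Hashable]]:
    """Separate variants into initial-only, recurrence-only and shared fractions."""
    return {
        "initial_only": initial - recurrence,
        "recurrence_only": recurrence - initial,
        "shared": initial & recurrence,
    }


def burden_per_mb(n_mutations: int, covered_mb: float) -> float:
    """Mutation burden as mutations per megabase of covered sequence."""
    if covered_mb <= 0:
        raise ValueError("covered_mb must be positive")
    return n_mutations / covered_mb


def is_hypermutator(n_mutations: int, covered_mb: float) -> bool:
    """Flag a sample whose burden exceeds the hypermutation threshold."""
    return burden_per_mb(n_mutations, covered_mb) > HYPERMUTATION_THRESHOLD


# Hypothetical variants keyed by (chromosome, position, alternate allele).
initial = {("chr2", 208248388, "T"), ("chr17", 7674220, "A"), ("chr1", 1000, "G")}
recurrence = {("chr2", 208248388, "T"), ("chr17", 7674220, "A"), ("chr9", 2000, "C")}

fractions = split_fractions(initial, recurrence)
print({name: len(variants) for name, variants in fractions.items()})
print(burden_per_mb(len(recurrence), covered_mb=30.0))
```

Keying variants by position and allele is a simplification: in practice, matching the same variant across two samples also has to account for differences in coverage between time points.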

These fraction-specific differences in mutational burden sug- 
gested that the activity of distinct mutational processes may also be 
time-dependent. We therefore classified mutations in each fraction 
according to the Catalogue of Somatic Mutations in Cancer (COSMIC) 
signature database. As expected, signature activity was closely related 
to subtype and fraction (Fig. 1b, Extended Data Fig. 3a). Signature 1 
(ageing) was nearly always the dominant signature among shared 
mutations in IDH-wild-type tumours, whereas the shared fraction in 
IDH-mutant-noncodel and IDH-mutant-codel tumours—tumour sub- 
types that are associated with a younger age of diagnosis—also showed 
a strong presence of signature 16 (unknown aetiology). Signatures 3 
(double-strand break repair), 15 (mismatch repair) and 8 (unknown 
aetiology) were mostly confined to the private fractions, which suggests 
that these processes were of lesser importance to tumour maintenance 
than those associated with ageing. 

The treatment of glioma includes alkylating agents that can induce 
hypermutation after treatment. We observed enrichment of the 
associated signature 11 in recurrent tumours treated with alkylating 
agents and with a mutational load exceeding 10 mutations per Mb 
(Fig. 1a, Extended Data Fig. 3b). Treatment-associated hypermuta- 
tion occurred most frequently among IDH-mutant-noncodels (47%), 
followed by IDH-mutant-codels (25%), and IDH-wild-type gliomas 
(16%) (Fig. 1c). The proportion of hypermutation events was signifi- 
cantly different between the three glioma subtypes (Fisher’s exact test 
P=2.0 x 10°), which suggests that IDH-mutant-noncodels are most 
sensitive to developing a hypermutator phenotype. 

Treatment-induced hypermutation has been associated with disease 
progression. We did not find any differences in overall survival 
between hypermutators and non-hypermutators treated with alkylating 
agents independent of age, subtype and MGMT methylation status 
(Fig. 1d, Supplementary Table 2a, b). To assess the pathogenicity of 
acquired mutations further, we studied their clonality. Newly acquired 
clonal mutations have penetrated most of the tumour (that is, a 
selective sweep) between initial and recurrence and mark clonal 
expansion. Conversely, acquired subclonal mutations are less prevalent, 
and therefore less likely to drive disease progression. Previous reports 
have suggested that mutations associated with alkylating agents are 
frequently clonal. We found that in 48% of hypermutated tumours, 
most of the recurrence-only mutations were clonal, potentially 
reflecting cases in which a selective sweep occurred (Extended Data 
Fig. 4a). However, IDH-mutant-noncodel hypermutators with predominantly 
clonal mutations did not show differences in survival compared with 
those containing predominantly subclonal mutations (log-rank test 
P = 0.38, Extended Data Fig. 4b). Alkylating agents such as 
temozolomide prolong the survival of adult patients with glioma. Our 
results show that treatment-induced hypermutation is common across 
subtypes and does not associate with reduced overall survival, 
supporting the noted benefit of alkylating agent therapy. 


Selective pressures during glioma evolution 


Environmental and treatment-induced pressures may drive changes 
in clonal architecture at recurrence. To evaluate selection over time, 
we clustered copy number changes and mutations on the basis of their 
cancer cell fraction (CCF). CCF values represent the fraction of cancer 
cells that contain a given alteration and reflect the relative timing of 
events, because alterations that are present in a subset of cancer cells 
probably occurred later than events present in all cancer cells (Fig. 2a). 
Most tumours (84%) demonstrated a mutational cluster with a CCF 
greater than 50% that persisted from the initial tumour to recurrence, 
probably reflecting the tumour trunk and containing the tumour-initiating 
driver mutations (Fig. 2b, Extended Data Fig. 5a). To determine 
changes in clonal dominance over time, we ranked clusters within each 
sample by their CCF value and found similarities in clonal architecture 
throughout the course of disease (Kendall rank correlation, tau = 0.20, 
P=3.76 x10; Fig. 2b, Extended Data Fig. 5b-d). These results sug- 
gested that the clonal structure at initial disease mostly persisted into 
recurrence. 
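
The comparison of clonal architecture between time points can be sketched as a rank correlation. The example below is a minimal illustration, assuming that clustering has already been done and that each cluster has a CCF at both time points. The cluster labels and CCF values are hypothetical, and the statistic is the plain Kendall tau-a (tied pairs count as neither concordant nor discordant), which may differ from the variant used in the analysis.

```python
from itertools import combinations
from typing import Dict, List


def rank_by_ccf(ccf_by_cluster: Dict[str, float]) -> Dict[str, int]:
    """Rank clusters from highest CCF (rank 1) to lowest."""
    ordered = sorted(ccf_by_cluster, key=ccf_by_cluster.get, reverse=True)
    return {cluster: rank for rank, cluster in enumerate(ordered, start=1)}


def kendall_tau(x: List[float], y: List[float]) -> float:
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least two items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical CCF values for four clusters in one patient.
initial_ccf = {"c1": 0.95, "c2": 0.60, "c3": 0.30, "c4": 0.10}
recurrence_ccf = {"c1": 0.98, "c2": 0.25, "c3": 0.40, "c4": 0.05}

initial_rank = rank_by_ccf(initial_ccf)
recurrence_rank = rank_by_ccf(recurrence_ccf)
clusters = sorted(initial_ccf)
tau = kendall_tau(
    [initial_rank[c] for c in clusters],
    [recurrence_rank[c] for c in clusters],
)
print(round(tau, 3))
```

In this toy patient only clusters c2 and c3 swap rank, so five of the six cluster pairs keep their order and tau is positive. A value near zero would indicate that the ordering of clusters at recurrence is unrelated to the ordering in the initial tumour.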

Fig. 1 | Temporal changes in glioma mutational burden and processes. 
a, Each column represents a single patient (n = 222) at two separate 
time points grouped by glioma subtype and ordered left-to-right by 
decreasing mutation frequency at recurrence. Top, mutation frequency 
differences between initial and recurrent tumours. Blue dotted line 
indicates increased mutation frequency, and the red dotted line 
indicates decreased mutational frequency. Middle, the proportion of 
total mutations shared (mustard), private to initial (magenta), or 
private to recurrence (blue). Bottom, clinical information including 
hypermutation status, therapy and grade changes. RT, radiation 

To deepen our assessment of selective pressures, we evaluated 
selection in initial and recurrent tumours by determining the normalized 
ratio between non-synonymous and synonymous mutations (dN/dS). 
Higher ratios (above one) suggest positive selection, and ratios less 
than one suggest negative selection. We found evidence for positive 
selection at both time points despite differences between subtypes 
(Fig. 2c). Separating mutations into mutational fractions demonstrated 
that shared but not private mutations showed dN/dS ratios above one 
in all three glioma subtypes, which indicates that only shared 
mutations (including truncal mutations) are likely to be subject to 
positive selection (Fig. 2c). The dN/dS ratio of initial-only mutations 
showed that these are neither positively nor negatively selected for, 
whereas recurrence-only mutations were subject to negative selection 
in IDH-wild-type gliomas. 
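
The reading of the ratio can be illustrated with a deliberately simplified calculation. The sketch below normalizes observed non-synonymous and synonymous counts by the number of sites at which each class of mutation could occur. The function names, the tolerance band around one and the example counts are hypothetical, and estimators used in practice additionally model mutational context, which is not attempted here.

```python
def dnds_ratio(n_nonsyn: int, n_syn: int, nonsyn_sites: float, syn_sites: float) -> float:
    """Normalized ratio of non-synonymous to synonymous mutation rates."""
    if n_syn == 0:
        raise ValueError("dN/dS is undefined without synonymous mutations")
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    dn = n_nonsyn / nonsyn_sites  # non-synonymous mutations per available site
    ds = n_syn / syn_sites        # synonymous mutations per available site
    return dn / ds


def interpret(ratio: float, tolerance: float = 0.05) -> str:
    """Map a ratio to the reading used in the text (the tolerance is an assumption)."""
    if ratio > 1 + tolerance:
        return "positive selection"
    if ratio < 1 - tolerance:
        return "negative selection"
    return "consistent with neutrality"


# Hypothetical counts, with three non-synonymous sites per synonymous site.
shared = dnds_ratio(n_nonsyn=450, n_syn=100, nonsyn_sites=3.0e6, syn_sites=1.0e6)
private = dnds_ratio(n_nonsyn=300, n_syn=100, nonsyn_sites=3.0e6, syn_sites=1.0e6)
print(round(shared, 2), interpret(shared))
print(round(private, 2), interpret(private))
```

A fixed tolerance is only a stand-in: whether a ratio differs from one is normally judged from its confidence interval rather than from a fixed band.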

To verify the reduced selective pressure in the private mutations, 
we used an orthogonal method to test for evidence of selection. The 
method uses distributions of variant allele frequencies and estimated 
mutation rates to detect whether profiles significantly deviate from 
a model of neutral evolution (that is, as depicted by a linear relation- 
ship in Fig. 2d). In accordance with results of the dN/dS ratios, private 
mutations demonstrated dynamics that were consistent with neutral 
evolution (Fig. 2d). Shared subclonal mutations deviated from lin- 
earity and were consistent with selection both in non-hypermutators 
and hypermutators (Fig. 2d, Extended Data Fig. 6a, b), which provides 
further evidence that the strongest selective forces occur early in glio- 
magenesis. 
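
The linearity check behind this test can be sketched in a few lines. Under neutral evolution, the cumulative number of subclonal mutations with allele frequency of at least f is expected to grow linearly with 1/f. The sketch below fits a least-squares line to that curve and reports R². It is an illustration only: it omits the mutation-rate estimates and distribution tests used by the published method, the function names and synthetic frequencies are hypothetical, and the cutoff of 0.98 is taken from the Fig. 2 legend.

```python
from typing import List, Tuple


def cumulative_by_inverse_vaf(vafs: List[float]) -> Tuple[List[float], List[int]]:
    """Return x = 1/f and M(f) = number of mutations with VAF >= f."""
    ordered = sorted(vafs, reverse=True)       # highest frequency first
    inverse = [1.0 / f for f in ordered]
    counts = list(range(1, len(ordered) + 1))  # cumulative count (ties not merged)
    return inverse, counts


def r_squared(x: List[float], y: List[float]) -> float:
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("need variation in both variables")
    return (sxy * sxy) / (sxx * syy)


def looks_neutral(vafs: List[float], r2_cutoff: float = 0.98) -> bool:
    """True when the cumulative curve is close to linear in 1/f."""
    x, y = cumulative_by_inverse_vaf(vafs)
    return r_squared(x, y) >= r2_cutoff


# Synthetic frequencies spaced evenly in 1/f, as expected under neutrality.
neutral_vafs = [1.0 / (4.0 + 0.1 * k) for k in range(1, 61)]
# The same tail plus an excess of mutations clustered near a VAF of 0.2.
clustered_vafs = neutral_vafs + [0.19 + 0.0005 * i for i in range(40)]

for label, vafs in (("neutral", neutral_vafs), ("clustered", clustered_vafs)):
    x, y = cumulative_by_inverse_vaf(vafs)
    print(label, round(r_squared(x, y), 4), looks_neutral(vafs))
```

The clustered example mimics a subclone under selection: its mutations pile up at one frequency, which bends the cumulative curve and lowers R² relative to the evenly spaced case.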

therapy. Alkyl, alkylating agent. b, Stacked bar plot (n = 219) 
indicating the dominant mutational signature among initial, recurrent 
and shared mutation fractions stratified by glioma subtype. I, initial; 
S, shared; R, recurrence. c, The proportion of glioma recurrences with 
alkylating agent-related hypermutation, grouped by glioma subtype. 
Fisher’s exact test was used to compare proportions between subtypes. 
d, Kaplan-Meier curve depicting overall survival in hypermutant (red) 
versus non-hypermutant (blue) patients treated with alkylating agent 
among IDH-wild-type (left, n = 99) and IDH-mutant-noncodel (right, 
n = 32) tumours. P values determined by log-rank test. 

Cohort-level analysis of selection masks the heterogeneity that 
exists in individual evolutionary trajectories. To determine the 
selective effects at each tumour time point, we used a Bayesian 
framework (SubClonalSelection algorithm) that simultaneously provides 
sample-specific probabilities for both selection and neutrality while 
modelling sources of noise in sequencing data. The classification of a 
sample as ‘selection’ or ‘neutral’ is determined by whichever model has 
the greater probability. Classification as neutral reflects the 
accumulation of random mutations that are not subject to selection. 
Given the stringent algorithm requirements, 183 patients were included 
in this analysis with at least one time point, and 104 patients with 
both time points (16 IDH-mutant-codels, 29 IDH-mutant-noncodels, 
59 IDH-wild-type; Supplementary Table 3). Neutral-to-neutral was the 
most common evolutionary trajectory across all three subtypes (52%), 
and IDH-wild-type tumours displayed the highest observed selection at 
any time point, with selection detected in 64% of tumours (Fisher’s 
exact test P = 0.01; Fig. 2e, Supplementary Table 3). IDH-wild-type 
gliomas with evidence for selection at recurrence had a shorter overall 
survival than IDH-wild-type gliomas classified as neutral at recurrence 
(P = 0.027; log-rank statistic, Fig. 2f), which suggests that subclonal 
competition associates with more aggressive tumour behaviour. To 
address the limitations of smaller sample sizes in the IDH-mutant 
subtypes, we fitted a Cox proportional hazards model including age at 
first diagnosis, all three glioma subtypes, and mode of selection at 
recurrence. This analysis revealed that selection at recurrence was 
significantly associated with shorter survival across subtypes (Hazard 
ratio = 1.53, 95% confidence interval 1.00-2.41, P = 0.048; 
Supplementary Table 4). We next investigated whether radiation and 
chemotherapy imposed a selective effect, by comparing the evolutionary 
status at recurrence with treatment and other clinical variables. We 
did not observe significant associations between subclonal selection 
and radiation therapy or chemotherapy (Fisher’s exact test P > 0.05; 
Supplementary Table 5), which suggests that standard therapeutic 
approaches for glioma have limited effect on the subclonal tumour 
architecture. Although high-depth sequencing datasets may be required 
to detect subtle selective effects, our analyses raise the possibility 
that the survival benefit derived from standard chemoradiation results 
from the elimination of tumour cells in which treatment sensitivity of 
individual cells is not determined by genetic factors. 

Fig. 2 | Quantifying selective pressures during glioma evolution. 
a, Schematic depiction of CCF values during tumour evolution indicating 
clonality and associated relative timing. b, Comparison of PyClone 
clusters ranked by CCF in matched initial and recurrent tumours. 
c, Left, dN/dS ratio for all variants (that is, global) in initial and 
recurrent tumours for each subtype. Hypermutators were not included 
(n = 187). Dots represent the global dN/dS ratio with associated Wald 
confidence intervals. Right, global dN/dS ratios for variant fractions 
per subtype. d, Cumulative distribution of subclonal mutations by their 
inverse variant allele frequency. Mutations were separated 
by time point, variant fraction and glioma subtype. Deviation from a 
linear relationship, significant Kolmogorov-Smirnov P values and 
Pearson’s R² values below 0.98 indicate selection. e, Sankey plot 
indicating the breakdown of SubClonalSelection evolutionary modes by 
subtype and therapy (n = 104). The sizes of the bands reflect sample 
sizes and band colours highlight the glioma subtype. Grey colouring 
reflects instances when treatment information was not available. 
f, Kaplan-Meier curve showing survival differences between 
IDH-wild-type recurrent tumours demonstrating selection (n = 39) 
compared with neutrally evolving tumours (n = 44). P value determined 
by log-rank test. 


Driver alteration frequencies across time 


We evaluated how stability, acquisition and the loss of mutation and 
copy number drivers over time affect glioma evolution. We used the 
dN/dS ratio to nominate 12 candidate mutation driver genes at both 
time points (Q < 0.05, Fig. 3a, Extended Data Fig. 7a) and determined 
significant alterations in copy number that recapitulated previously 
identified drivers (Extended Data Fig. 7b). Mutations in IDH1 and 
co-occurring loss of the 1p/19q chromosome arms have been suggested 
as glioma-initiating events, which was corroborated by the observation 
that these events were not lost or acquired during the surgical interval 
(Fig. 3a, Extended Data Fig. 8a). Similarly, we observed that mutations 
in the TERT promoter were almost always shared in the IDH-mutant-codel 
and IDH-wild-type samples, although many samples lacked sufficient 
coverage in this GC-rich region. Chromosome 7 gains and chromosome 10 
losses were present in a large majority of IDH-wild-type initial 
tumours and persisted into recurrence. 

Shifts in the fraction of cancer cells containing an event may also 
indicate a time dependency of drivers. We determined changes in 
cellular prevalence of shared driver events by ordering events in each 
sample by their CCF value (Extended Data Fig. 9). ATRX mutations in 
IDH-mutant-noncodel initial tumours demonstrated lower CCFs than 
TP53 (P = 0.03) and IDH1 (P = 0.10) mutations, suggesting that IDH1 and 
TP53 mutations precede ATRX inactivation. There was no difference 
in CCF values between IDH1 and TP53 among initial gliomas (P = 0.98); 
however, IDH1 mutations demonstrated significantly lower CCF values 
than TP53 mutations (P = 0.0018) in recurrent gliomas. We did 
not observe any CCF differences among driver mutations detected in 
IDH-wild-type tumours at either time point. Chromosome 10 deletion 
CCFs were higher than chromosome 7 amplifications (P = 0.0036), 
which indicates that chromosome 10 deletions arise earlier. Similarly, 
there was no difference in CCF values between CDKN2A deletion and 
EGFR amplification (P = 0.70). EGFR and chromosomal arm events 
significantly differed (that is, 10p del versus EGFR amp, P = 0.0019) 
but not CDKN2A deletion and chromosomal events (that is, 10p del 
versus CDKN2A del, P = 0.33). The consistently high CCF values for EGFR 
amplifications could indicate that these events precede even some 
larger chromosomal aberrations, while not excluding the possibility 
that high levels of extrachromosomal EGFR artificially inflate CCF. 

Fig. 3 | Patterns of glioma driver frequencies over time. a, Driver 
dynamics for single-nucleotide variants (SNVs) nominated by the dN/dS 
ratios and copy number alterations (CNVs) nominated by GISTIC 
(n = 222). Each column represents a single patient at two separate time 
points stratified by subtype and ordered left-to-right by the number of 
driver alterations. The degree of aneuploidy difference (recurrence − 
initial) offers a summary metric for increases (>0) or decreases (<0) 
in aneuploidy at recurrence. Variants are marked and different shapes 
indicate whether a variant was shared or private. The variant type is 
depicted by its colour. Stacked bar plots accompanying each gene/arm 
provide cohort-level proportions for whether the alteration was 
shared, lost or acquired. Rec, recurrence; evo, evolution. 
b, Aneuploidy comparison in matching initial and recurrent 
IDH-mutant-noncodel tumours. c, Within-sample CCF comparison of CDKN2A 
homozygous deletion (homdel) to genome-wide CCF as a proxy for 
aneuploidy. A relatively higher CCF indicates temporal precedence. 
P value determined by Wilcoxon signed-rank test. d, Kaplan-Meier curve 
comparing survival in IDH-mutant-noncodel tumours with an alteration in 
the cell cycle, acquired aneuploidy, or both (shades of red) versus 
unaltered IDH-mutant-noncodel tumours (blue). P value determined by 
log-rank test. 

Longitudinal changes in CCF values provide additional insights into 
evolutionary dynamics. For instance, the CCF value may increase when a 
driver event is linked to clonal expansion, or conversely, decrease when 
a clone is outcompeted. Most individual drivers did not demonstrate 
significant consistent CCF changes between the initial tumour and 
recurrence (Extended Data Fig. 10a). A notable exception was the TP53 
mutation CCF that increased over time (P = 0.037) in 
IDH-mutant-noncodels, but not IDH-wild-type gliomas (P = 0.13, Extended 
Data Fig. 10b). We did not observe any differences in IDH1 CCF over 
time among IDH-mutant-noncodel tumours, possibly because the general 
trend of these tumours to increase in CCF is counteracted by the 
biological loss of relevance of mutant IDH1 over time (Extended Data 
Fig. 10c). Indeed, a gross comparison of all shared mutation CCFs 
revealed an increase in recurrent IDH-mutant-noncodel tumours 
(P < 0.0001), which may reflect increased clonality and a reduction in 
intratumoral heterogeneity (Extended Data Fig. 10d). By contrast, 
shared CCFs decreased in IDH-wild-type tumours, potentially indicating 
a general increase in intratumoral heterogeneity at recurrence 
(P < 0.0001, Extended Data Fig. 10d). We confirmed that 
IDH-mutant-noncodel CCF increases and IDH-wild-type decreases were not 
biased by patients with high mutational burden through the 
classification of patient-specific shared mutation CCF change (Extended 
Data Fig. 10e). 

Fig. 4 | Neoantigen selection during tumour progression. a, Mean 
proportion of coding mutations giving rise to neoantigens 
(neoantigens/nonsynonymous mutations) stratified by glioma subtype and 
time point (n = 222). Data are mean ± s.d. b, Box plot depicting the 
distribution of observed-to-expected neoantigen ratios in the GLASS 
cohort stratified by glioma subtype. P value determined by t-test. Each 
box spans quartiles, with the lines representing the median ratio for 
each group. Whiskers represent absolute range, excluding outliers. 
c, Scatterplot depicting the association between the 
observed-to-expected neoantigen ratio in a patient’s initial versus 
recurrent tumours. Each point represents a single patient tumour pair. 
R denotes Pearson correlation coefficient. Panels b and c only include 
samples from pairs with at least three 
neoantigens in the initial and recurrent tumours (n = 131, 63 and 24 
pairs for IDH-wild-type, IDH-mutant-noncodel, and IDH-mutant-codel, 
respectively). d, Ladder plot depicting the difference in 
observed-to-expected neoantigen ratio between a tumour’s clonal and 
subclonal neoantigens. Each set of points connected by a line 
represents one tumour. Tumours are stratified by whether they were a 
patient’s initial or recurrent tumour. Lines are coloured by each 
patient’s glioma subtype. Panel d only includes samples from pairs with 
at least three clonal neoantigens and at least three subclonal 
neoantigens in both the initial and recurrent tumours (n = 35, 20 and 9 
for IDH-WT, IDH-mutant-noncodel and IDH-mutant-codel, respectively). 
P value determined by paired two-sided t-test. Colours in each panel 
represent the glioma subtype. 

We next investigated whether specific somatic alterations were 
acquired or lost over time. Gene-specific enrichment of many 
recurrence-only mutations was found in hypermutated tumours, but there 
was no enrichment for somatic gene alterations in non-hypermutators, 
which suggests that glioma recurrence is not directed by particular sets 
of mutations (Extended Data Fig. 8b). Within subtypes, we detected 
an enrichment in CDKN2A homozygous deletions (Fig. 3a, Extended 
Data Fig. 8a) in recurrent IDH-mutant-noncodels, which was 
corroborated by additional alterations to cell cycle genes (focal gain 
of CCND2, CDK4 and CDK6, and mutation or homozygous loss of RB1). 
Mutations in cell cycle checkpoint control genes are associated with 
genomic instability. Therefore, we analysed aneuploidy levels by 
determining the proportion of the genome that had undergone aneuploidy 
events (Extended Data Fig. 11a, b). We observed that 
IDH-mutant-noncodel tumours had a higher level of aneuploidy at 
recurrence (Wilcoxon rank sum test P=1.4 x 10 total aneuploidy, 
P= 8.6 x 10° arm-level aneuploidy; Extended Data Fig. 11c, d) with 
tumours carrying acquired cell cycle gene alterations displaying the 
largest increases in aneuploidy (P=7.6 x 10°; Wilcoxon rank sum test, 
Fig. 3b). We reasoned that CDKN2A deletions may precede aneuploidy. 
Homozygous CDKN2A deletions had significantly higher CCFs than the 
average somatic copy number variation CCF across the genome (as a 
surrogate for aneuploidy-related copy number changes), suggesting that 
CDKN2A loss occurred before aneuploidy (Fig. 3c). These alterations 
may hasten disease progression as patients with either alterations in 
cell cycle genes or the largest increases in aneuploidy at recurrence 
demonstrated significantly shorter survival than patients without these 
alterations (log-rank test P < 0.0001, Fig. 3d). Together, the 
persistence of drivers over time and the paucity of consistent change 
indicate that therapy does not result in selection of specific sets of 
molecular changes. 
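
The aneuploidy measure used in this section, the proportion of the genome that has undergone aneuploidy events, can be pictured as a length-weighted fraction of copy-number segments. The sketch below is an illustration under stated assumptions: the segment representation, the neutral copy number of 2 and the toy coordinates are hypothetical, and the calling thresholds used in the study are not given in this section.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int        # start coordinate (inclusive)
    end: int          # end coordinate (exclusive)
    copy_number: int  # total copy number of the segment


def aneuploidy_fraction(segments: List[Segment], neutral_copy_number: int = 2) -> float:
    """Fraction of the profiled genome length with a non-neutral copy number."""
    total = sum(seg.end - seg.start for seg in segments)
    if total <= 0:
        raise ValueError("segments must cover a positive length")
    altered = sum(
        seg.end - seg.start
        for seg in segments
        if seg.copy_number != neutral_copy_number
    )
    return altered / total


# Toy profiles on three equally long chromosomes (coordinates are arbitrary).
initial = [
    Segment("chr7", 0, 100, 2),
    Segment("chr9", 0, 100, 2),
    Segment("chr10", 0, 100, 2),
]
recurrence = [
    Segment("chr7", 0, 100, 3),
    Segment("chr9", 0, 50, 0),
    Segment("chr9", 50, 100, 2),
    Segment("chr10", 0, 100, 2),
]

difference = aneuploidy_fraction(recurrence) - aneuploidy_fraction(initial)
print(round(difference, 2))
```

A positive difference corresponds to acquired aneuploidy at recurrence, and a negative one to a decrease.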


Immunoediting activity in glioma 

We next investigated how the immune microenvironment affects 
evolutionary trajectories. The immune system may prune tumour 
cells carrying immunogenic (neo-)antigens, resulting in the selection 
of subclones capable of evading the immune response. Evidence of 
this immunoediting process has been shown in several cancer types, 
including glioma, and suggests active immunosurveillance that 
may be therapeutically exploited. We computationally predicted 
neoantigen-causing mutations. As expected, the neoantigen load 
across the GLASS cohort was strongly correlated with exonic mutation 
burden (Spearman’s rho = 0.89), with 42% of nonsynonymous exonic 
mutations giving rise to neoantigens on average. This fraction did not 
significantly differ by glioma subtype or between initial and recurrent 
tumours (P > 0.05, Wilcoxon rank-sum test; Fig. 4a). The most common 
neoantigen arose from the clonal R132H mutation in IDH1 and was 
present in 22 out of 88 IDH-mutant initial and recurrent tumours. 
Beyond mutations in IDH1, no mutations gave rise to a neoantigen 
found in more than three tumours at a given time point (Supplementary 
Table 6). Across the dataset, neoantigens and non-immunogenic 
mutations exhibited similar changes in CCF values between initial and 
recurrent tumours, indicating a lack of neoantigen-specific selection 
processes over time (Extended Data Fig. 12a). 

We then examined the extent to which immunoediting occurred by 
comparing the observed neoantigen rate of each sample to an expected 
rate that was empirically derived from our dataset. The output of this 
approach is a normally distributed set of ratios centred at 1. Samples 
with an observed-to-expected neoantigen ratio less than 1 exhibit 
evidence of neoantigen depletion relative to the rest of the dataset, 
and thus are more likely to have been immunoedited. We found that none 
of the three glioma subtypes contained observed-to-expected ratios 
that significantly differed from 1 (P > 0.05, one-sample t-test), 
although IDH-wild-type tumours exhibited significantly lower scores 
than IDH-mutant-noncodels (t-test, P = 0.04; Fig. 4b). We also did not 
observe an association between the observed-to-expected ratio and 
survival when adjusting for subtype and age (Wald test, P > 0.05), nor 
was there a difference between samples with neutral evolution dynamics 
compared to those exhibiting evidence of subclonal selection. When 
comparing samples longitudinally, we found that the 
observed-to-expected neoantigen ratio was strongly correlated between 
initial and recurrent tumours of each patient (Pearson’s R = 0.73, 
P=5 x10 °8), which suggests that the neoantigen depletion level in the 
recurrence reflects that of the initial tumour (Fig. 4c). 
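
One simple way to picture the observed-to-expected ratio is shown below. The sketch assumes that the expected neoantigen count of a sample is its number of nonsynonymous mutations multiplied by the cohort-wide fraction of such mutations that yield neoantigens. This pooled-rate model, the function names and the example counts are assumptions made for illustration and should not be read as the empirical model used in the study.

```python
from typing import Dict, Tuple


def cohort_rate(samples: Dict[str, Tuple[int, int]]) -> float:
    """Pooled fraction of nonsynonymous mutations that give rise to neoantigens."""
    total_neo = sum(neo for neo, _ in samples.values())
    total_nonsyn = sum(nonsyn for _, nonsyn in samples.values())
    if total_nonsyn == 0:
        raise ValueError("no nonsynonymous mutations in the cohort")
    return total_neo / total_nonsyn


def observed_to_expected(samples: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Observed neoantigen count divided by the count expected from the pooled rate."""
    rate = cohort_rate(samples)
    if rate == 0:
        raise ValueError("no neoantigens in the cohort, ratio is undefined")
    return {
        name: neo / (nonsyn * rate)
        for name, (neo, nonsyn) in samples.items()
        if nonsyn > 0
    }


# Hypothetical (neoantigen count, nonsynonymous mutation count) per sample.
samples = {"S1": (20, 50), "S2": (30, 100), "S3": (55, 100)}
ratios = observed_to_expected(samples)
for name, ratio in ratios.items():
    label = "depleted" if ratio < 1 else "not depleted"
    print(name, round(ratio, 2), label)
```

Because the expected rate is derived from the same cohort, the ratios are relative: a value below 1 means fewer neoantigens than the rest of the dataset would predict, not depletion in an absolute sense.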

Immunoediting is most likely to take place in the tumours with high 
cytolytic activity and low levels of immunosuppressive activity. 
Hypermutators, which have high loads of neoantigens, have previously 
been associated with highly cytolytic microenvironments. However, we 
did not observe any differences in the observed-to-expected neoantigen 
ratio between hypermutated recurrent tumours and their initial 
counterparts, nor did we observe differences between hypermutated and 
non-hypermutated recurrent tumours, indicating that immunoediting 
activity is not related to the total number of mutations in a sample 
(Wilcoxon rank-sum test P > 0.05; Extended Data Fig. 12b). To more 
directly determine whether there were immunological factors 
associated with neoantigen depletion, we analysed CIBERSORT immune 
cell fractions from a subset of samples that had undergone expression 
profiling in a previous study (n = 84 from 42 tumour pairs). Initial 
tumours with an observed-to-expected neoantigen ratio greater than 
1 exhibited significantly higher levels of CD4+ T cells than those with 
a ratio less than 1, whereas recurrent tumours with a ratio greater 
than 1 exhibited significantly higher levels of macrophages and 
neutrophils, and significantly lower levels of plasma cells relative to 
those with a ratio less than 1 (P < 0.05, Wilcoxon rank-sum test; 
Extended Data Fig. 12c). 

Although we did not detect many factors associated with the
observed-to-expected neoantigen ratio, we did observe that the ratio
was significantly associated with the total number of unique HLA loci
in a patient (Spearman’s rho = 0.28, P=2 x 107°), reflecting similar
findings in lung cancer. This may bias analyses comparing the ratio across
patients. To determine whether immunoediting varies over time in a
patient-agnostic manner, we compared the observed-to-expected neoantigen
ratio derived from the clonal mutations of a sample, which likely
arose earlier in tumour evolution, to that derived from their subclonal
mutations, which arose later. We did not observe a significant difference
in the observed-to-expected neoantigen ratio of each patient’s clonal
and subclonal neoantigens, regardless of glioma subtype or whether
the sample was an initial tumour or recurrence (P > 0.05, paired t-test;
Fig. 4d). Together, these analyses suggest that neoantigens in glioma
are not exposed to differing levels of selective pressure throughout
their development.


Discussion 

We reconstructed the evolutionary trajectories of 222 patients with
glioma to help to understand treatment failures and tumour progression.
The longitudinal molecular profiles revealed common features such
as acquired hypermutation and aneuploidy, and also highlighted the
individualistic paths of glioma evolution after treatment. Our results
provide evidence that the current standard of care therapies do not
frequently coerce glioma down predictable paths. Instead, an unexpected
number of gliomas appeared to evolve stochastically after early driver
events. We expect that continuing to profile patient tumours over time
using comprehensive sequencing approaches will identify other common
evolutionary paths. Our results highlight the prospects of several
ongoing efforts that may inform new glioma therapies.

The observation that treatment-induced hypermutation occurred
across subtypes, but did not confer a detrimental effect on patient
survival, leaves the clinical importance of glioma hypermutation
uncertain. Future analyses that consider the number of therapy cycles
and MGMT DNA methylation status will help to determine factors that
predispose tumours to hypermutation and identify therapies that
effectively exploit the vulnerabilities of this phenotype (for example,
high mutational burden). Acquired cell cycle alterations and aneuploidy in
recurrent IDH-mutant-noncodel gliomas also provide a rationale to
target these more aggressive phenotypes with CDK inhibitors or with
compounds that disrupt microtubule dynamics. Finally, our analyses
revealed that immunoediting activity does not vary in glioma over time,
although we did observe variation between individual patients. Further
molecular and immunological data are needed to fully understand the
effect that this variability has on glioma evolution and to devise
therapies directed at the glioma immune response. To this end, we found
that clonal neoantigens arising from the IDH1(R132H) mutation
persisted from the initial tumour into the recurrence, justifying neoantigen
vaccine approaches as treatments for initial and recurrent glioma.

Collectively, these findings help shape our perspective on what 
constitutes an optimal treatment, and what approaches would result 
in the greatest removal or killing of glioma cells possible. Genomic 
characterization efforts such as The Cancer Genome Atlas (TCGA) 
have greatly increased our understanding of glioma biology but were 
limited to a single snapshot in evolutionary time. The GLASS resource 
provides a framework to study the patterns of glioma evolution and 
treatment response. 
Methods


Data reporting 

No statistical methods were used to predetermine sample size. The 
experiments were not randomized, and investigators were not blinded 
to allocation during experiments and outcome assessment. 


DNA sequencing and data collection 

The GLASS dataset consists of both unpublished and published 
sequencing data as outlined in Supplementary Table 1. Among the 
cohort were exomes from 436 glioma samples (200 patients), whole- 
genome data from 165 glioma samples (78 patients), with overlap- 
ping exome/whole-genome data on 78 glioma samples (38 patients). A 
matching germline sequence was available for all patients. The dataset 
includes 257 sets of at least two time-separated tumour samples, 17 
standalone recurrences, and 19 patients with at least two geographi- 
cally distinct tumour portions. More specifically, the dataset includes 
exome or whole-genome sequencing data on 211 primary gliomas, 234 
first recurrences, 32 second recurrences, 11 third recurrences and 1 
fourth recurrence (Supplementary Table 7). 

Newly generated whole-genome sequencing data for the Chinese 
University of Hong Kong (HK), Northern Sydney Cancer Centre (NS) 
and MD Anderson Cancer Center (MD) cohorts were subjected to 
150 base paired-end sequencing. The HK samples were sequenced 
using HiSeqX, whereas the NS and MD cohorts were sequenced using 
NovaSeq, according to Illumina’s protocols. Whole-exome capture 
was performed using the following platforms as reported in previous 
publications.

The Agilent SureSelect Human All Exon 50 Mb capture kit was used for
patients SF-0001–SF-0021, and the Agilent SureSelect Human All Exon
V4 capture kit was used for patients SF-0024–SF-0029 in the University
of California San Francisco cohort. The Agilent SureSelect Human All
Exon V4 or V5 kit was used to capture samples in the Kyoto University
cohort. The Samsung Medical Center cohort reported using the Agilent
SureSelect kit for patients SM-R056–SM-R071, SM-R075, SM-R076 and
SM-R095–SM-R114, whereas the Illumina TruSeq Exome capture kit
was used for patient SM-R072. Exome capture was performed using the
Agilent SureSelect Human All Exon 50 Mb kit in the TCGA glioblastoma
(GBM) cohort and the Agilent SureSelect Human All Exon v.2.0 44 Mb
kit in the TCGA low grade glioma (LGG) cohort. Columbia University
cases were captured using the Agilent V3 50 Mb kit, sequencing 90 bp
paired-end reads for samples R009-TP, R009-R1, R011-TP, R011-R1,
R014-TP, R014-R1, R017-R1, R018-R1 and R019-R1. Mapping files of
initial tumour and normal samples of patients R017–R019 were obtained
from the TCGA through the CG-hub. All other samples were captured
using the Agilent SureSelect XT Human All Exon v.4 Kit, 80 million
paired-end reads, 150× on-target coverage. Samples in the Henry Ford
Hospital cohort were multiplexed and sequenced using Illumina HiSeq
2000 by the Sequencing and Microarray Facility at an average target
exome coverage of 100× using 76-bp paired-end reads. Samples in
the HK cohort were subjected to 75 base paired-end sequencing for
HK-0001–HK-0004, as performed using NextSeq in high output mode.
In the Leeds Cohort (LU), the SureSelectXT V5 kit (PE100) was used to
construct exome libraries. The Illumina TruSeq Exome capture kit was
used for samples at the Medical University of Vienna - Research Center
for Molecular Medicine (CeMM).


GLASS identifiers 

A GLASS barcode system was created, based on TCGA barcode design,
in an effort to de-identify patient information and provide an organized
framework for the different pieces of the dataset.

GLASS barcodes are composed of 24 characters. The first four
characters specify the project (either GLSS or TCGA). All datasets
submitted to The GLASS Consortium, published and unpublished,
were given the GLSS project ID. Samples that were part of the TCGA
cohorts (TCGA-GBM and TCGA-LGG) were given a TCGA designation.
The next two characters designate the centre where the samples were
either acquired or sequenced (Supplementary Table 7). This is followed
by the four-character centre-specific patient identification that was
kept as close as possible to the patient identification provided by the
collaborators to allow a simplified trace-back process. Patient data are
divided by a relative sample type, such as initial tumour (TP), recurrent
tumour (R1), normal tissue (NB or NM, for example), or metastatic
tumour sample (M1). If there was more than one recurrence the
relative number was specified following ‘R’. Some patients had surgeries
for which a biospecimen was unavailable. Thus, a surgical number was
also provided to indicate temporal ordering (Supplementary Table 8).
To include spatially separated samples the portion designation was
added, which is followed by one character specifying the type of
analyte, either DNA (D) or RNA (R). As there is variation in the sequencing
analysis, a three-character designation represents either whole-genome
sequencing (WGS) or whole-exome sequencing (WXS). The last part
of the GLASS barcode is a six-character designation unique to each
barcode that was randomly generated.
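
As an illustration of the barcode layout described above, the following is a minimal parsing sketch. The text gives the field order and most field widths but not the separators or the width of the portion field, so the hyphen-delimited layout, the two-digit portion and the example barcode are assumptions made for this sketch, as are the names `GlassBarcode` and `parse_glass_barcode`.

```python
import re
from typing import NamedTuple


class GlassBarcode(NamedTuple):
    project: str      # GLSS or TCGA
    centre: str       # two-character centre code
    patient: str      # four-character centre-specific patient ID
    sample_type: str  # e.g. TP, R1, NB, NM, M1
    portion: str      # portion designation (two digits assumed here)
    analyte: str      # D (DNA) or R (RNA)
    assay: str        # WGS or WXS
    unique_id: str    # six-character random suffix


# Assumed layout: fields joined by hyphens, 24 characters without hyphens.
_BARCODE = re.compile(
    r"^(GLSS|TCGA)-"      # project
    r"([A-Z0-9]{2})-"     # centre
    r"([A-Z0-9]{4})-"     # patient
    r"([A-Z0-9]{2})-"     # sample type
    r"([0-9]{2})"         # portion (assumption: two digits)
    r"([DR])-"            # analyte
    r"(WGS|WXS)-"         # sequencing assay
    r"([A-Z0-9]{6})$"     # unique suffix
)


def parse_glass_barcode(barcode: str) -> GlassBarcode:
    """Split a barcode into its fields, or raise ValueError if malformed."""
    match = _BARCODE.match(barcode)
    if match is None:
        raise ValueError(f"not a recognised GLASS-style barcode: {barcode!r}")
    return GlassBarcode(*match.groups())


# Hypothetical barcode, invented for illustration only.
example = parse_glass_barcode("GLSS-XX-0001-R1-01D-WXS-ABC123")
```

Returning a named tuple keeps each field addressable by name (for example `example.sample_type`), which is convenient when grouping samples by patient and time point.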


Computational pipelines 

All pipelines were developed using snakemake 5.2.2. Unless otherwise
stated, all tools mentioned are part of the GATK 4 suite. All data were
collected at a central location (The Jackson Laboratory) and analysed 
using homogenous pipelines capable of processing raw fastq files as 
well as re-processing previously analysed bam files. 


Alignment and pre-processing 

Data pre-processing was conducted in accordance with the GATK Best
Practices using GATK 4.0.10.1. In brief, aligned BAM files were separated
by read group, sanitized and stripped of alignments and attributes using
‘RevertSam’, giving one unaligned BAM (uBAM) file per readgroup.
Uniform readgroups were assigned to uBAM files using
‘AddOrReplaceReadgroups’. Similarly, unaligned fastq files were assigned
uniformly designated readgroup attributes and converted to uBAM format
using ‘FastqToSam’. uBAM files underwent quality control using ‘FastQC
0.11.7’. Sequencing adapters were marked using ‘MarkIlluminaAdapters’.
uBAM files were finally reverted to interleaved fastq format using
‘SamToFastq’ and aligned to the b37 genome (human_g1k_v37_decoy) using
‘BWA MEM 0.7.17’, and attributes were restored using ‘MergeBamAlignment’.
‘MarkDuplicates’ was then used to merge aligned BAM files from multiple
readgroups and to mark PCR and optical duplicates across identical
sequencing libraries. Lastly, base recalibration was performed using
‘BaseRecalibrator’ followed by ‘ApplyBQSR’. Coverage statistics were
gathered using ‘CollectWgsMetrics’. Alignment quality control was
performed by running ‘ValidateSamFile’ on the final BAM file, and quality
control results were inspected using ‘MultiQC 1.6a0’. A haplotype
database for fingerprinting was generated using a modified version of
the code on https://github.com/naumanjaved/fingerprint_maps. The
tool ‘CrosscheckFingerprints’ was used to confirm that all readgroups
within a sample belong to the same individual, and that all samples
from one individual match. Any mismatches were marked and excluded
from further analysis.


Variant detection 

Variant detection was performed in accordance with the GATK Best
Practices using GATK 4.1.0.0. Germline variants were called from control
samples using Mutect2 in artefact detection mode and pooled into a
cohort-wide panel of normals. Somatic variants were subsequently
called in individual tumour samples (single-sample mode) and in entire
patients using GATK 4.1 Mutect2 in multi-sample mode. Mutect2 was
given matched control samples, the aforementioned panel of normals
and the gnomAD germline resource as additional controls. Cross-sample
contamination was evaluated using ‘GetPileupSummaries’ and
‘CalculateContamination’ run for both tumour and matching control
samples. Read orientation artefacts were evaluated using
‘CollectF1R2Counts’ and ‘LearnReadOrientationModel’. Somatic likelihood,
read orientation, sequence context, germline and contamination filters
were applied using ‘FilterMutectCalls’.


Variant post-processing 

BCFTools 1.9 was used to normalize, sort and index variants. A
consensus VCF was generated from all variants in the cohort, removing
any duplicate variants. The consensus VCF file was annotated
using GATK 4.1 Funcotator and the v1.6.20190124s annotation data 
source. Allele frequencies from multi-sample Mutect2 were used 
to compare allele frequencies between related samples. Multi- 
sample Mutect2 calls and filters mutations across a patient as a 
whole and does not determine mutation calls in a single sample. 
Single-sample mutation calls were overlaid on the multi-sample 
calls to infer whether variants were called in individual samples. 
Single-sample called variants that were not present in the multi- 
sample callset were discarded. 


Mutational burden 

Mutational burden was calculated as the number of mutations per Mb 
sequenced. A minimum coverage threshold of 15x was required for 
each base. DNA hypermutation was defined for recurrent tumours 
with greater than 10 mutations per Mb sequenced as these values were 
considered outliers (1.5 times the interquartile range above the upper 
quartile). Notably, there were a few initial gliomas that demonstrated a 
mutational frequency above 10 mutations per Mb. However, the ‘hyper- 
mutation’ classification was restricted to only patients with this level 
at recurrence since these likely reflect different evolutionary paths. 
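
The burden and hypermutation definitions above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, and the quartile convention used for the outlier fence (the default of Python's `statistics.quantiles`) is an assumption, since the text does not say how quartiles were computed.

```python
import statistics


def covered_bases(depths, min_depth=15):
    """Count positions meeting the minimum coverage threshold (15x in the text)."""
    return sum(1 for depth in depths if depth >= min_depth)


def mutational_burden(n_mutations, n_covered_bases):
    """Mutations per megabase of adequately covered sequence."""
    return n_mutations / (n_covered_bases / 1e6)


def is_hypermutated(burden, is_recurrence, threshold=10.0):
    """Apply the >10 mutations/Mb rule, restricted to recurrent tumours."""
    return is_recurrence and burden > threshold


def upper_outlier_fence(values):
    """Upper quartile plus 1.5 times the interquartile range."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 + 1.5 * (q3 - q1)
```

For example, 300 mutations across 30 Mb of covered sequence give a burden of exactly 10 mutations per Mb, which does not exceed the threshold and so would not be classified as hypermutated.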


Mutational signatures 

The relative contributions of the COSMIC mutational signatures
were determined from a patient’s initial-only, recurrence-only, and
shared mutations by solving the non-negative least-squares problem
for each set of mutations using the 30 signatures from version
2 (March 2015). Six signatures were dominantly enriched in at least
3% of the fractions, and we re-solved the non-negative least-squares
problems using the reduced six-signature model to increase accuracy
and reduce noise.
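
To make the non-negative least-squares step concrete, here is a small self-contained sketch. It uses a multiplicative-update solver chosen for brevity, which is not necessarily the solver the authors used, and the four-channel "signatures" are toy values invented for illustration (real COSMIC signatures have 96 trinucleotide channels).

```python
def nnls_multiplicative(signatures, counts, iterations=2000, eps=1e-12):
    """Approximate argmin_x ||sum_j x_j * signatures[j] - counts||^2 with x >= 0.

    Uses multiplicative updates, which keep x non-negative as long as the
    signatures and counts are themselves non-negative.
    """
    k = len(signatures)
    channels = range(len(counts))
    # Precompute S b and S S^T, where the rows of S are the signatures.
    stb = [sum(sig[c] * counts[c] for c in channels) for sig in signatures]
    sts = [[sum(a[c] * b[c] for c in channels) for b in signatures]
           for a in signatures]
    x = [1.0] * k
    for _ in range(iterations):
        denom = [sum(sts[j][l] * x[l] for l in range(k)) for j in range(k)]
        x = [x[j] * stb[j] / (denom[j] + eps) for j in range(k)]
    return x


def relative_contributions(exposures):
    """Normalise exposures so that they sum to one."""
    total = sum(exposures)
    return [e / total for e in exposures]


# Toy example: counts built as 30 * sig_a + 10 * sig_b.
SIG_A = [0.5, 0.5, 0.0, 0.0]
SIG_B = [0.0, 0.5, 0.5, 0.0]
exposures = nnls_multiplicative([SIG_A, SIG_B], [15.0, 20.0, 5.0, 0.0])
fractions = relative_contributions(exposures)
```

Refitting with a reduced signature set, as described above, would amount to calling the same function again with only the retained signatures.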


Copy number segmentation 

Copy number identification was performed according to the GATK Best
Practices and is outlined briefly here. The pipeline differs slightly for
whole genomes and whole exomes. For whole genomes, the genome
was segmented into 10-kb bins using ‘PreprocessIntervals’. For exomes,
overlapping regions between several commonly used capture kits
(Broad Human Exome b37, Nextera Rapid Capture, TruSeq Exome,
SeqCap EZ Exome V3, Agilent SureSelect V4, Agilent SureSelect V7)
were identified using ‘bedtools multiIntersectBed’. The tool
‘PreprocessIntervals’ was used to apply 1-kb padding and to merge overlapping
intervals. In parallel, ‘SelectVariants’ was used to subset the gnomAD
resource of germline variants to variants with a population allele
frequency greater than 5%. Next, ‘CollectReadCounts’ was used to count
reads in the bins generated by ‘PreprocessIntervals’ separately for
autosomes and allosomes. In parallel, ‘CollectAllelicCounts’ was used
to count reference and alternate reads at gnomAD variant sites with
a population allele frequency greater than 5%. The cohort was
subsequently split into batches determined by sequencing centre and
‘CreateReadCountPanelOfNormals’ was used to create a panel of
normals for each batch. Panels of normals were created separately for
allosomes and autosomes, and allosomes were separated further by
sex. To improve the panel of normals further, the GC content annotation
of each interval, as determined by ‘AnnotateIntervals’, was provided. Next,
‘DenoiseReadCounts’ was used to denoise the binned read counts
output by ‘CollectReadCounts’, given a panel of normals determined by
batch, chromosomes (allosomes or autosomes) and sex. Denoised copy
ratios were plotted and inspected for quality concerns using
‘PlotDenoisedCopyRatios’. The tool ‘ModelSegments’ is an implementation
of a gaussian-kernel binary-segmentation algorithm and was used to
merge contiguous segments and assign copy and allelic ratios. The
results of this segmentation were plotted using
‘PlotModelledSegments’ and inspected for quality concerns.


Copy number calling 

A copy number caller loosely based on GATK ‘CallCopyRatioSegments’
(which in turn is based on ReCapSeg) and GISTIC was implemented
to call both arm-level and high-level copy number changes,
respectively.

Segments (from ‘ModelSegments’) with a non-log₂ copy ratio
between 0.9 and 1.1 were determined to be neutral. These segments
were then weighted by length and a weighted mean and standard
deviation of the non-log₂ copy ratio (once-filtered) were determined.
Outlier segments were removed and once again a weighted mean and standard
deviation of the non-log₂ copy ratio (twice-filtered) were determined.
Segments with a non-log₂ copy ratio between 0.9 and 1.1 and segments
within two standard deviations of the twice-filtered mean were
determined to be neutral, and segments outside of these boundaries were
determined to have a low-level amplification or deletion, depending
on the direction.
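
The two-pass filtering described in this paragraph can be sketched as below. The text does not state how outlier segments were identified between the two passes, so this sketch assumes "more than two weighted standard deviations from the once-filtered mean"; that rule and the function names are assumptions, not the authors' implementation.

```python
import math


def weighted_mean_sd(ratios, lengths):
    """Length-weighted mean and standard deviation of copy ratios."""
    total = sum(lengths)
    mean = sum(r * w for r, w in zip(ratios, lengths)) / total
    var = sum(w * (r - mean) ** 2 for r, w in zip(ratios, lengths)) / total
    return mean, math.sqrt(var)


def call_low_level(segments, n_sd=2.0, low=0.9, high=1.1):
    """Return -1 (deletion), 0 (neutral) or +1 (amplification) per segment.

    `segments` is a list of (length_bp, copy_ratio) tuples, with copy
    ratios on the linear (non-log2) scale.
    """
    core = [(w, r) for w, r in segments if low <= r <= high]
    if not core:
        raise ValueError("no segments in the neutral copy-ratio window")
    # First pass: statistics over segments in the 0.9-1.1 window.
    mean1, sd1 = weighted_mean_sd([r for _, r in core], [w for w, _ in core])
    # Assumed outlier rule: drop segments beyond n_sd SDs of the first mean.
    kept = [(w, r) for w, r in core if abs(r - mean1) <= n_sd * sd1]
    # Second pass: twice-filtered statistics.
    mean2, sd2 = weighted_mean_sd([r for _, r in kept], [w for w, _ in kept])
    calls = []
    for _, ratio in segments:
        if low <= ratio <= high or abs(ratio - mean2) <= n_sd * sd2:
            calls.append(0)
        else:
            calls.append(1 if ratio > mean2 else -1)
    return calls
```

For instance, `call_low_level([(100e6, 1.0), (50e6, 1.02), (20e6, 0.6), (10e6, 1.5)])` treats the first two segments as neutral and flags the last two as a deletion and an amplification.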

The weighted mean and standard deviation of the non-log₂ copy
ratio (once-filtered) was then determined individually for each
chromosome arm. Outlier segments were removed and the weighted mean
and standard deviation of the non-log₂ copy ratio (twice-filtered) was
determined again. To determine a high-level amplification and deletion
threshold, the most highly amplified and deleted chromosome arms
were selected, respectively. The twice-filtered mean plus (high-level
amplification) or minus (high-level deletion) two times the standard
deviation of the selected arms were used as high-level thresholds.

Gene-level copy numbers were called by intersecting the gene
boundaries with the segment intervals and by calculating the weighted
non-log₂ copy ratio for that gene. The copy number call for that gene was
then determined by comparing the gene-level non-log₂ copy ratio to
the previously determined thresholds.
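
A minimal sketch of the gene-level calculation follows. It assumes that "weighted" means weighted by the number of gene bases each segment overlaps, and that coordinates are half-open intervals on a single chromosome; both are assumptions, as is the function name.

```python
def gene_copy_ratio(gene_start, gene_end, segments):
    """Overlap-weighted mean copy ratio for one gene.

    `segments` is a list of (start, end, copy_ratio) tuples on the same
    chromosome as the gene. Returns None if no segment overlaps the gene.
    """
    weighted_sum = 0.0
    covered = 0
    for seg_start, seg_end, ratio in segments:
        overlap = min(gene_end, seg_end) - max(gene_start, seg_start)
        if overlap > 0:
            weighted_sum += overlap * ratio
            covered += overlap
    return weighted_sum / covered if covered else None
```

A gene spanning positions 100 to 200 that is covered half by a segment with ratio 1.0 and half by a segment with ratio 2.0 would receive a gene-level ratio of 1.5, which would then be compared with the sample's thresholds.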


dNdScv 

The dN/dS ratios were estimated using the R package dNdScv
(https://github.com/im3sanger/dndscv), which was run using the default
and recommended parameters for all mutations in initial tumour samples,
recurrent tumour samples, and for each mutational fraction (unique
to initial, unique to recurrent and shared). All analyses were conducted
separately within the three main tumour subtypes.


Aneuploidy calculation 

The most reductive metric of aneuploidy was computed by taking the 
size of all non-neutral segments divided by the size of all segments. The 
resulting aneuploidy value indicates the proportion of the segmented 
genome that is non-diploid. 
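
As a worked illustration of this metric, the sketch below computes the non-neutral fraction from a list of segments. The tuple layout and function name are assumptions made for the example.

```python
def aneuploidy_fraction(segments):
    """Fraction of the segmented genome carrying a non-neutral call.

    `segments` is a list of (length_bp, call) tuples, where call is 0 for
    neutral and any non-zero value for a gain or loss.
    """
    total = sum(length for length, _ in segments)
    if total == 0:
        raise ValueError("no segmented sequence")
    non_neutral = sum(length for length, call in segments if call != 0)
    return non_neutral / total
```

With 60 Mb neutral, 30 Mb gained and 10 Mb lost, the value is 40 / 100 = 0.4.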

In parallel, an arm-level aneuploidy score modelled after a previously
described method was computed. In brief, adjacent segments with
identical arm-level calls (−1, 0 or +1) were merged into a single segment
with a single call. For each merged/reduced segment, the proportion
of the chromosome arm it spans was calculated. Segments spanning
greater than 80% of the arm length resulted in a call of −1 (loss), 0
(neutral) or +1 (gain) to the entire arm, or ‘NA’ if no contiguous segment
spanned at least 80% of the arm’s length. For each sample the number
of arms with a non-neutral event was finally counted. The resulting
aneuploidy score is a non-negative integer with a minimum value of 0 (no
chromosomal arm-level events detected) and a maximum value of 39
(total number of autosomal chromosome arms excluding the short
arms for chromosomes 13, 14, 15, 21 and 22).
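
The arm-level score can be sketched as follows. This is an illustrative reading rather than the published implementation: merged segments are measured from the start of the first to the end of the last segment in a run of identical calls (so small gaps between adjacent segments are absorbed), and `None` stands in for ‘NA’. Function names and the input layout are hypothetical.

```python
from itertools import groupby


def arm_call(segments, arm_length, min_fraction=0.8):
    """Return -1, 0 or +1 for a chromosome arm, or None ('NA').

    `segments` is a list of (start, end, call) tuples sorted by start,
    all lying on the same chromosome arm.
    """
    for call, run in groupby(segments, key=lambda seg: seg[2]):
        run = list(run)
        # Merge adjacent segments with the same call into one span.
        span = run[-1][1] - run[0][0]
        if span / arm_length > min_fraction:
            return call
    return None


def aneuploidy_score(arm_calls):
    """Count arms with a gain or loss; neutral and NA arms are ignored."""
    return sum(1 for call in arm_calls.values() if call in (-1, 1))
```

Because only one merged segment can cover more than 80% of an arm, returning the first qualifying run is sufficient.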


Estimates of evolutionary pressures 

Evolutionary pressures were evaluated both by variant status and
glioma subtype using the neutralitytestr algorithm as previously
described (R package: neutralitytestr v.0.0.2,
https://github.com/marcjwilliams1/neutralitytestr). Individual variant
allele frequency vectors were merged at the level of glioma subtype by
variant status. Only mutations found in copy-neutral regions were included in
these analyses. For all else, default parameters were used. Merged
variant allele frequency distributions were deemed to be selected
when the neutral null hypothesis was rejected using several metrics.
Tests for neutrality required that both R² < 0.98 and the area
between the two curves of (1) merged variant allele frequency data
and (2) a normalized distribution expected under neutrality to be
significantly different.

The SubclonalSelection algorithm was applied to GLASS mutation
data to measure the selection strength in individual tumour samples
(Julia package: SubclonalSelection,
https://github.com/marcjwilliams1/SubClonalSelection.jl). Patients that
had samples at both time points with a TITAN-defined purity estimate > 0.5
and > 25 subclonal mutations in diploid regions were included. Mean
coverage across all mutations
was used as the ‘read_depth’ input parameter and the model was run
with the recommended 10° iterations and 1,000 particles. Samples
were classified as neutral or selected based on the model that had the
highest probability, in line with the prior applications to TCGA data.
Classification based on the highest model probability yielded stable
results as there was not a significant change in proportions when
setting a higher classification probability threshold (P > 0.05, Pearson’s
chi-square test, for both probability thresholds of 0.6 and 0.7). At all
three probability thresholds (0.5, 0.6 and 0.7), Kaplan-Meier survival
analyses between selection at recurrence and overall survival
continued to indicate that patients with IDH-wild-type tumours that were
selected had a worse overall survival (P = 0.03 (n = 81), P = 0.01 (n = 66)
and P = 0.01 (n = 56), respectively).


Mutation clonality 

Each patient’s clonal architecture was inferred using PyClone (v.0.13.1)
by grouping SNVs into clonal clusters
(https://github.com/aroth85/pyclone). The patient-level input mutation
matrix was reduced by limiting to sites with at least 30x coverage across
all samples. PyClone was subsequently run using a binomial density model,
connected initiation, and 10,000 iterations. Sample purities were provided
for each patient and parental copy number (minor and major allele counts)
from TITAN were given. PyClone results were post-processed using a burn-in
of 1,000, thin of 1, minimum cluster size of 2 and a maximum number
of clusters per patient of 12. Individual mutations were determined to
be clonal if the PyClone CCF values were ≥ 0.5 and subclonal for mutations
with CCF > 0.1 and CCF < 0.5; mutations were considered non-clonal
when CCF < 0.1, as previously described.
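
The CCF cut-offs above map onto a simple classifier, sketched here for illustration. It reads the clonal cut-off as CCF of at least 0.5, and, because the text leaves a CCF of exactly 0.1 unassigned, it places that boundary value in the non-clonal class; both readings are assumptions of this sketch.

```python
def classify_ccf(ccf):
    """Label a mutation by its cancer cell fraction (CCF)."""
    if not 0.0 <= ccf <= 1.0:
        raise ValueError(f"CCF must lie in [0, 1], got {ccf}")
    if ccf >= 0.5:
        return "clonal"
    if ccf > 0.1:
        return "subclonal"
    # Assumption: CCF exactly equal to 0.1 is treated as non-clonal.
    return "non-clonal"
```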


CNV clonality 

Allele-specific copy number, tumour purity and ploidy estimates 
were derived using a probabilistic model (TITAN, v.1.19.1) for both 
whole-genome and whole-exome sequencing samples. TITAN was
supplied with the tumour denoised read counts output by GATK 
DenoiseReadCounts and the tumour allelic counts at loci found to 
be heterozygous in control samples output by ModelSegments. An 
‘alphaK’ (and ‘alphaKHigh’) parameter of 2,500 and 10,000 was used 
for exomes and genomes, respectively. The patient sex was provided 
to improve fitting allosomes. For each tumour-control pair, TITAN 
was run assuming an initial ploidy of two or three, and assuming one to 
three clusters, resulting in a total of six possible solutions per tumour/ 
control pair. To select the optimal solution, TITAN’s internal
selectSolution function was used with a threshold of 0.15 giving additional
weight to diploid solutions.


Timing analysis 

The CCF values output by TITAN or PyClone were used for separately 
timing copy number changes or mutations. To time specific copy num- 
ber changes in genes, the average CCF for that gene was calculated. 
When timing mutations in genes, the highest CCF amongst the non- 
synonymous mutations was taken. 


Neoantigen analyses 
Neoantigens in this analysis were defined as all 8-11-mer peptides that
arose from an exonic nonsynonymous SNV or indel and bound their
respective patient’s HLA class I molecules at a binding affinity score
(half-maximal inhibitory concentration, IC50) that was < 500 nM and
better than or equal to the wild-type form of the peptide. Each patient’s
four-digit HLA class I types were inferred using OptiType (v.1.3.1,
https://github.com/FRED-2/OptiType) run on each patient’s matched normal
sample. VCF files for each tumour sample were annotated using Variant
Effect Predictor (ensembl) with the ‘downstream’ and ‘wildtype’
plugins. Neoantigens from these VCFs were then called using pVACseq
(v.4.0.10, https://github.com/griffithlab/pVAC-Seq) run using
netMHCpan (v.2.8, http://www.cbs.dtu.dk/services/NetMHCpan-2.8/). For
each pVACseq run, epitope length was set to 8, 9, 10 or 11, minimum
binding affinity fold change was set to 1, and downstream sequence
length was set to full, with default parameters used for all other settings.
Downstream neoantigen analyses were performed using the pVACseq
output linked to its respective mutation information. Neoantigen-causing
mutations were defined as all mutations that gave rise to at least one
neoantigen. The observed-to-expected neoantigen ratio was calculated using
a previously developed approach that compares each tumour’s observed
neoantigen rate to an empirically derived expected rate that assumes no
selection against neoantigen-causing mutations. From the gold set
samples in the GLASS cohort (n = 222), define $N_s$ to be the expected number
of nonsynonymous missense SNVs per synonymous SNV with trinucleotide
context $s$. $B_s$ is then defined as the expected number of
neoantigen-generating missense SNVs per nonsynonymous missense SNV with
trinucleotide context $s$. For a given sample $i$, define $Y_i$ as the
sample’s set of synonymous SNVs and $s(m)$ to be the trinucleotide context
of synonymous SNV $m$. The expected number of nonsynonymous missense SNVs,
$N_{\mathrm{pred},i}$, and neoantigen-causing mutations,
$B_{\mathrm{pred},i}$, can then be calculated as follows:


$$N_{\mathrm{pred},i} = \sum_{m \in Y_i} N_{s(m)}$$

$$B_{\mathrm{pred},i} = \sum_{m \in Y_i} N_{s(m)} B_{s(m)}$$


To obtain the final neoantigen depletion ratio, $R_i$, of sample $i$, the
observed number of neoantigen-causing mutations in the sample,
$B_{\mathrm{obs},i}$, is divided by the sample’s observed number of
nonsynonymous missense SNVs, $N_{\mathrm{obs},i}$, and then this ratio is
divided by the ratio of $B_{\mathrm{pred},i}$ and $N_{\mathrm{pred},i}$.
Thus:

$$R_i = \frac{B_{\mathrm{obs},i}/N_{\mathrm{obs},i}}{B_{\mathrm{pred},i}/N_{\mathrm{pred},i}}$$

For analyses examining clonal/subclonal neoantigen ratios, the 
observed and expected numbers were calculated by subsetting the 
SNVs of a sample by the respective criteria and then recalculating the
ratio as described above. To mitigate overfitting, all analyses presented 
here used samples from patients with at least three neoantigen-causing 
mutations in their primary and recurrent tumours. 
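
The observed-to-expected calculation defined above can be sketched in a few lines. The per-context rate tables would in practice be estimated from the whole cohort; the trinucleotide contexts and rates used here are made-up values for illustration, and the function names are hypothetical.

```python
def expected_counts(syn_contexts, n_rate, b_rate):
    """Expected missense SNVs and neoantigen-causing SNVs for one sample.

    `syn_contexts` lists the trinucleotide context of each synonymous SNV
    in the sample. `n_rate[s]` is the expected number of missense SNVs per
    synonymous SNV in context s, and `b_rate[s]` is the expected number of
    neoantigen-generating SNVs per missense SNV in context s.
    """
    n_pred = sum(n_rate[s] for s in syn_contexts)
    b_pred = sum(n_rate[s] * b_rate[s] for s in syn_contexts)
    return n_pred, b_pred


def depletion_ratio(b_obs, n_obs, b_pred, n_pred):
    """Observed neoantigen rate divided by the expected neoantigen rate."""
    return (b_obs / n_obs) / (b_pred / n_pred)


# Toy rates for two contexts (illustrative numbers only).
N_RATE = {"ACA": 2.0, "TCG": 3.0}
B_RATE = {"ACA": 0.5, "TCG": 0.25}
n_pred, b_pred = expected_counts(["ACA", "ACA", "TCG"], N_RATE, B_RATE)
ratio = depletion_ratio(b_obs=11, n_obs=28, b_pred=b_pred, n_pred=n_pred)
```

A ratio below 1 indicates fewer neoantigen-causing mutations than expected in the absence of selection, and a ratio above 1 indicates more. The clonal and subclonal variants of the ratio would reuse the same functions on the corresponding subsets of SNVs.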


Immune cell analyses 
CIBERSORT relative immune cell fraction data used in downstream 
neoantigen analyses were downloaded from a previous publication.


Statistical methods

All data analyses were conducted in R 3.4.2, Python 2.7.15, PostgreSQL 
10.5, and Julia 0.7. All survival analyses including Kaplan-Meier plots 
and Cox proportional hazards models were conducted using the R 
packages survival and survminer. 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


All de-identified, non-protected access somatic variant profiles and 
clinical data are accessible via Synapse (http://synapse.org/glass). Raw 
data of the various sequencing datasets can be obtained in the Sup- 
plementary Information. 


Code availability 


All custom scripts and pipelines are available on the project’s github 
page (https://github.com/TheJacksonLaboratory/GLASS). 
[Extended Data Fig. 1 flowchart, text content]

a, Quality control (QC) filters

GLASS resource (n = 288 patients; 874 biological samples; 243 WGS and
799 WXS aliquots)
  • Exclude samples with lower sequencing depth
  • Remove one patient with a germline-tumor mismatch
  • Exclude any single-time-point recurrences
Total possible pairs of tumors from the same patient: 371 longitudinal
pairs (n = 257 patients); 23 multi-sector pairs (n = 6 patients)
  • Select highest quality sample when both WGS and WXS or multisector
    data are available
  • When multiple longitudinal time points are available, select two highest
    quality time points that maximizes measurement of disease
Longitudinal mutation data available (n = 257 initial and recurrent tumor
pairs; “silver set”)
  • Exclude samples based on manual review of copy number profiles to
    identify poorly segmented genomes
  • Retain subjects with high-quality copy number data at both time points
Mutation and copy number data presented in manuscript (n = 222 patients;
“gold set”)

b, Overlapping datasets

Mutation and copy number data presented in manuscript (n = 222 patients;
“gold set”)
  • DNA methylation (n = 102 samples, 51 pairs): previously published DNA
    methylation data; DNA methylation was used to determine MGMT status and
    classify samples according to methylation subtypes (Capper et al.
    Nature 2018)
  • RNA sequencing (n = 84 samples, 42 pairs): previously published
    RNA-sequencing data (Wang et al. Cancer Cell 2017); RNA sequencing data
    was used for CIBERSORT analysis

Extended Data Fig. 1 | Sample selection. a, Quality control workflow steps identifying all GLASS samples available as a resource and the identification of the
highest quality set of patient pairs (n = 222) used for the presented mutational and copy number analyses. b, Additional available datasets.
initial and matched recurrent samples across three subtypes. Wilcoxon signed-
and fraction. P values were determined by the linear model and adjusted by
presented.
indicates whether the mutations resulted in coding changes. b, Kaplan-Meier
cluster. b, Comparison of PyClone clusters ranked by CCF in matched initial and
recurrent tumours, as in Fig. 2b, but separated by subtype. c, d, Examples of
are available in the GLASS resource, but only two time-separated samples were
used throughout to ensure clarity.
Extended Data Fig. 8 | Driver acquisition over time. a, Tabulated numbers of
SNV (top) and CNV (bottom) driver events that were shared, initial-only or
recurrence-only. P values were determined by a two-sided Fisher test
comparing the initial-only fraction to the recurrence-only fraction testing for
acquisition. b, One-sided Fisher test comparing the initial-only fraction to the
recurrence-only fraction among previously implicated glioma drivers testing
for driver acquisition. P values were adjusted for multiple testing using the false
discovery rate (x axis, −log10(adjusted P value)). Hypermutators (red) and
non-hypermutators (black) were separately analysed.
Extended Data Fig. 10 | Between time point intra-patient CCF comparison.
a, Driver gene CCF comparison between initial and matched recurrences. Lines
are coloured by variant classification. P values determined by two-sided
Wilcoxon rank-sum test. b, TP53 CCF by subtype, otherwise as in a. c, IDH1 CCF
by subtype, otherwise as in a. d, Ladder plot visualizing change in CCF across all
SNVs between initial and recurrent tumours, separated by subtype. P values
determined by Wilcoxon rank-sum test. e, Initial and recurrent mutations in
each patient were compared using a Wilcoxon rank-sum test. Bar plot with
counts of patients in each subtype are shown. Patients lacking significant
change are shown in yellow, and those with a significant increase or decrease
are shown in dark and light blue, respectively.
Extended Data Fig. 11 | Aneuploidy calculation. a, Heat map displaying the
chromosomal arm-level events (x axis) with patients represented in each row.
Patients are placed in the same order for both the initial (left) and recurrence
(right). White space was inserted as a break between the three subtypes.
Extended Data Fig. 12 | Neoantigen evolution and cellular analysis. a, Bar
plots representing the number of shared mutations that give rise to
neoantigens (top row, ‘immunogenic’) and those that do not give rise to
neoantigens (bottom row, ‘non-immunogenic’) stratified by longitudinal
clonality (‘(clonality in initial) – (clonality in recurrence)’) and further
separated by subtype. The percentage of longitudinal clonality per subtype
and mutation is shown. b, Left, ladder plot depicting the difference in
observed-to-expected neoantigen ratio between the initial and recurrent
tumours of patients with hypermutated tumours at recurrence. Each set of
points connected by a line represents one tumour (n = 70). Right, box plot
depicting the distribution of observed-to-expected neoantigen ratios in
recurrent tumours stratified by hypermutator status (n = 35 and 183 for
hypermutators and non-hypermutators, respectively). Each box spans
quartiles, with the lines representing the median ratio for each group. Whiskers
represent absolute range, excluding outliers. P values were determined by a
paired and an unpaired two-sided t-test, for left and right graphs, respectively.
c, Stacked bar plots depicting the average relative fraction of 11 CIBERSORT cell
types in the neoantigen depleted (<1) and non-depleted (>1) initial and
recurrent tumour subgroups. P values to the right of each plot indicate a
significant difference between the depleted and non-depleted groups for the
noted cell type at that time.
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 
Policy information about availability of computer code


Data collection No software was used for data collection. 


Data analysis MultiQC version: 1.6a0 (quality assessment) 
FastQC 0.11.7 (quality assessment) 
BWA MEM 0.7.17 (alignment) 
R 3.4.2 (general data analyses) 
Python 2.7.15 (general data analysis) 
Julia 0.7 (general data analysis) 
PostgreSQL 10.5 (data management) 
BCFTools 1.9 (normalize, sort and index variants) 
snakemake 5.2.2 (pipeline development) 
GATK (including Mutect2) version: 4.1.0.0 (SNV/CNV detection) 
freebayes version: 1.2.0 (variant filtering)
vcf2maf version: 1.6.16 (variant filtering and annotation)
MutationalPatterns version: 1.6.1 (mutational signatures)
TITAN version: 1.19.1 (purity, ploidy, CNV clonality estimates)
dndscv (R package) version: 0.0.1.0 (selection strength, nominate driver genes)
alluvial (R package) version: 0.1-2 (visualize longitudinal neutrality)
DBI (R package) version: 1.0.0 (database management)
tidyverse (R package) version: 1.2.1 (data analysis and visualization)
survival (R package) version: 2.42-6 (survival analyses) 
neutralitytestr version: 0.0.2 (subtype-level, variant-level selection) 
SubClonalSelection version: 0.0.0 (sample-level selection) 
PyClone version: 0.13.1 (mutational clusters) 
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers.
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.
netMHCpan version: 2.8 (neoantigen prediction) 
All other custom scripts and pipelines are available on the project’s github page (https://github.com/TheJacksonLaboratory/GLASS) 
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


All deidentified, non-protected access somatic variant profiles and clinical data are accessible via Synapse (http://synapse.org/glass). A subset of whole genome and 
whole exome sequencing data has been deposited in the National Center for Biotechnology Information's Sequencing Read Archive and/or the European Genome/ 
Phenome Archive (EGA). Please see Supplementary Table 1 for availability and accession codes. 
Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size No statistical methods were used to predetermine sample size. Sample size was a function of availability. 


Data exclusions We defined a quality control process to integrate whole exome and whole genome sequencing data collected from multiple cohorts. As
shown in Extended Data Fig. 1, two datasets, Silver and Gold, were constructed to be used for each major analysis type, SNV and CNV,
respectively. The two criteria used are intended to provide quality classifications for samples across fingerprinting, coverage, copy number
variation (CNV) data and clinical annotation. Fingerprinting was performed using CrosscheckFingerprints (Picard) to check that all of the
input files (readgroups, libraries, samples, files) belong to the same patient, and to remove duplicated cases, unmatched
samples, and samples of poor quality. Any evidence of mismatch rendered the sample “block”; otherwise the sample was annotated as
“allow”. To ensure suitable coverage for mutation calling, samples with near 0 mutation frequency as well as those 2 standard deviations
below the mean for either WGS or WXS were annotated as “block”. Samples were categorized as “allow”, “review”, or “block”. Copy number
data were excluded via manual review of all selected copy number solutions. Manual review consisted of identifying whether data had an
atypical or noisy segmentation profile. While we recognize that this strategy is not objective, it proved to be an effective strategy for
identifying poorly performing samples. Insufficient signal, noisy signal, TITAN run failure and unexpected genome stability (little to no copy number
changes observed, suggesting low purity) were the main reasons for sample exclusion or review. Clinical data were another source of sample
filtering. Exclusion of samples was mostly related to sample pairs where the surgical interval was very short (1-2 months) and thus did not appear
to be a true recurrence. Caution should be used when considering whether a sample represents a true recurrence as no standard set time
limits exist. Categories for clinical data include “allow”, “interval 1 or less months”, “interval 2 or less months”, “different location” and
“surgical indication” (including “further debulking”). Those interested in using the dataset for further analysis are encouraged to make their
own judgments on the criteria they select. The Silver set is filtered to include those pairs with no fingerprinting mismatches and sufficient
coverage and is made up of 257 pairs. The Gold set contains 222 pairs, which in addition to the previously mentioned criteria also contain
acceptable CNV calls in both samples.
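
The Silver and Gold set definitions above can be written as a small filter. This is a sketch only: the function name and argument layout are hypothetical, and treating “review” as not passing is an assumption that the text does not state.

```python
def pair_sets(fingerprint, coverage, cnv):
    """Decide Silver/Gold membership for one initial/recurrence pair.

    Each argument is a (sample_a, sample_b) tuple of quality labels drawn
    from "allow", "review" and "block". Assumption: only "allow" passes.
    """
    def passes(statuses):
        return all(status == "allow" for status in statuses)

    # Silver: no fingerprinting mismatch and sufficient coverage.
    silver = passes(fingerprint) and passes(coverage)
    # Gold: Silver criteria plus acceptable CNV calls in both samples.
    gold = silver and passes(cnv)
    return {"silver": silver, "gold": gold}
```

Under this sketch every Gold pair is also a Silver pair, consistent with the Gold set (222 pairs) being smaller than the Silver set (257 pairs).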


Replication Replication was limited to select patient samples where both whole genome sequencing and whole exome sequencing was available. All attempts 
at replication were successful. 
Randomization There was no randomization in this study.


Blinding All patient samples were deidentified and were assigned a study-specific barcode. Blinding was not relevant to our study since there was no 
randomization of groups. 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.




Materials & experimental systems Methods 


n/a | Involved in the study n/a | Involved in the study 
Antibodies ChIP-seq 
Eukaryotic cell lines Flow cytometry 
Palaeontology MRI-based neuroimaging


Animals and other organisms 


Human research participants 


Clinical data 


Human research participants 


Policy information about studies involving human research participants 




Population characteristics The dataset includes 271 sets of at least two time-separated tumor samples and 17 standalone recurrences. The majority of sets
contain two tumor samples (n=246, 85%), with 19 (6.6%) three-tumor sample sets, three (1%) four-tumor sample sets, one
(0.3%) with a total of five tumor samples and 17 (5.9%) standalone post-treatment tumor samples. Basic clinical information
including age (years), gender, overall survival (months), tumor grade, and tumor histology was available for 90% (260/288) of
patients and for 92% (536/584) of tumor samples of the dataset.
Temozolomide and radiation treatment information was available for 68% of the cohort (399/584); data on other treatment
modalities were available for 119 patients. Median age at diagnosis of GLASS patients in the IDHmut-noncodel and IDHmut-codel
subtypes was 34 years in both cases, and in the IDHwt group it was 53 years. This compares with 46 years for
IDHmut-codels, 38 years for IDHmut-noncodels and 59 years for IDHwt in the TCGA cohort. Patients in our dataset were
biased toward longer survival as 261 patients were deemed fit for surgical resection or biopsy at recurrence. Median survival for
primary glioblastoma patients was 21 months (95% CI 19-23) in the GLASS cohort versus 15 months in historical cohorts.
Patients in this cohort were predominantly treated at teaching/academic centers, which have been shown to be an independent
predictive factor of longer survival compared with non-teaching/community hospital settings.
All other relevant patient demographics for the GLASS cohort are presented in the Supplement.
Recruitment Informed consent was obtained from all study subjects as part of each institution's individual IRB. 


Ethics oversight All tissue source centers listed in Supplementary Table 1 obtained study approval by the corresponding institutional review board (IRB) and 
informed consent from all patients in the cohort. Data pooling at the Jackson Laboratory was performed under the oversight of the IRB at the 
Jackson Laboratory. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 
Policy information about clinical studies
All manuscripts should comply with the ICMJE guidelines for publication of clinical research and a completed CONSORT checklist must be included with all submissions. 


Clinical trial registration NA. 
Study protocol NA. 
Data collection NA. 


Outcomes NA. 
In the Drosophila brain, ‘compass’ neurons track the orientation of the body and head
(the fly’s heading) during navigation. In the absence of visual cues, the compass
neuron network estimates heading by integrating self-movement signals over time.
When a visual cue is present, the estimate of the network is more accurate. Visual
inputs to compass neurons are thought to originate from inhibitory neurons called R
neurons (also known as ring neurons); the receptive fields of R neurons tile visual
space. The axon of each R neuron overlaps with the dendrites of every compass
neuron, raising the question of how visual cues are integrated into the compass.
Here, using in vivo whole-cell recordings, we show that a visual cue can evoke synaptic
inhibition in compass neurons and that R neurons mediate this inhibition. Each
compass neuron is inhibited only by specific visual cue positions, indicating that
many potential connections from R neurons onto compass neurons are actually weak
or silent. We also show that the pattern of visually evoked inhibition can reorganize
over minutes as the fly explores an altered virtual-reality environment. Using
ensemble calcium imaging, we demonstrate that this reorganization causes
persistent changes in the compass coordinate frame. Taken together, our data
suggest a model in which correlated pre- and postsynaptic activity triggers
associative long-term synaptic depression of visually evoked inhibition in compass
neurons. Our findings provide evidence for the theoretical proposal that associative
plasticity of sensory inputs, when combined with attractor dynamics, can reconcile
self-movement information with changing external cues to generate a coherent sense
of direction.


The compass neurons in the Drosophila brain exhibit some resemblance
to the head-direction cells of the mammalian brain. Visual cues
stabilize the tuning preferences of mammalian head-direction cells,
and when a visual cue is rotated to a new horizontal position,
the preferences of all of the head-direction neurons rotate together.
It has been proposed that the mammalian head-direction system
represents a ring attractor, a network in which global dynamics exhibit
multiple stable states that unfold in a repeated sequence in response
to an input. However, we do not know how visual cues anchor the
mammalian head-direction system at a mechanistic level. It has been
suggested that Hebbian synaptic plasticity of visual inputs enforces the
correct mapping between sensory cues and attractor network states.

Similar to mammalian head-direction cells, Drosophila compass
neurons (called E-PG neurons) have properties of a ring attractor.
Indeed, the dendrites of E-PG neurons are arranged in a ring in the
brain (Fig. 1a). At any point in time, there is one ‘bump’ of activity in the
E-PG ensemble, which rotates as the fly turns. This network receives
continuous input from brain regions that track the rotational velocity
of the fly via optic flow signals, proprioceptive signals and/or motor
efference signals. These rotational velocity inputs push the bump
around the circle. Visual cues make the position of the bump more
accurate and stable. We do not know whether visual inputs to E-PG
neurons are plastic: the offset between the E-PG bump and the visual
world is different in different individuals and it can occasionally change
unpredictably within an individual; however, network instability alone
does not provide evidence for synaptic plasticity.

The anatomy of R neuron axons is another reason to suspect the existence
of synaptic plasticity in this network. Each R neuron axon overlaps
with the dendrites of every E-PG neuron (Fig. 1b). If all these R-to-E-PG
connections were functionally equivalent, information about
the position of a visual cue would be discarded. Instead, it seems more
likely that the all-to-all matrix of R-to-E-PG anatomical connections
(Fig. 1c) represents a set of potential functional connections that can
be repatterned during spatial learning. We therefore set out to test two
hypotheses: first, that individual E-PG neurons respond selectively
to specific visual cue positions and, second, that changes in visual-heading
associations can trigger systematic, time-locked changes in
the pattern of E-PG visual inputs.

Our first challenge was to isolate the synaptic input to E-PG neurons
that is related to visual cue position, separate from the synaptic
input related to the rotational velocity of the fly. We reasoned that this
should be possible if we flashed visual cues transiently at randomized
positions, preventing the fly from behaviourally fixating the stimulus.
We therefore performed in vivo whole-cell recordings from E-PG
neurons while flashing a bright vertical bar on a dark circular panorama
at randomized horizontal positions (Fig. 1d). In a typical neuron, we
observed hyperpolarization that was time-locked to flashes at specific
positions (Fig. 1e). To verify that these neural responses are not related
to the rotational velocity of the fly, we analysed the movement of the
air-cushioned ball that the fly was standing on (Extended Data Fig. 1).
Neural responses were unrelated to the rotational velocity of the flies
around the time of the visual flash (Fig. 1e), and there was no correlation
between the rotational velocity and the flash (Fig. 1f). Therefore, we
can interpret visually locked responses as synaptic inputs related to
visual cue position. We call this the visual receptive field of the cell. The
finding of visually evoked hyperpolarization is consistent with the fact
that R neurons release the inhibitory neurotransmitter γ-aminobutyric
acid (GABA).

same regardless of the fly’s behaviour. Cue flash is 500 ms. f, Mean rotational
velocity around cue presentation (black), ±1 s.d. (grey) across flies. Magenta
randomizing cue positions, Bonferroni-corrected; because the mean lies
of visual receptive fields of E-PG neurons (73 neurons in 68 flies). Cells are
response. Histogram shows the number of cells preferring each cue position.
Some cells were filled to determine their location. h, Cue position eliciting peak
inhibition versus neuron location (no significant correlation: circular
correlation coefficient = 0.097, P = 0.66, n = 21; see Extended Data Fig. 2g). D,
dorsal; V, ventral.

Fig. 2 | Visual receptive fields of E-PG neurons align with heading tuning.
a, Interleaved blocks measuring visual receptive fields and heading tuning.
b, Top, E-PG voltage during a VR epoch. Bottom, VR heading. A heading of 0°
means the cue is in front of the fly. c, Comparison of visual receptive fields
(blue) and heading tuning (red) from three example E-PG neurons (from three
flies, with Pearson’s correlation coefficients). d, Pearson’s correlation
coefficients from 40 cells in 39 flies (all cells from Fig. 1). The mean and 95%
confidence interval are shown as horizontal and vertical lines, respectively.
The means of the data are outside the 95% confidence interval of a bootstrap
distribution (grey violin plot) computed on randomized visual-heading
pairings.

In almost every E-PG neuron, we found that some visual cue positions
elicited hyperpolarization while other positions elicited no
hyperpolarization (Fig. 1e, g). This suggests that each E-PG neuron receives
relatively strong input from some R neurons but weak or non-existent
input from other R neurons. In approximately half of the E-PG neurons,
we also found that some cue positions elicited depolarization (Fig. 1g).
Depolarization may represent disinhibition: because there is ongoing
mutual inhibition between E-PG neurons, a visual cue that inhibits
one E-PG neuron will disinhibit other E-PG neurons.

We found that different E-PG neurons had distinct visual receptive
fields (Fig. 1g). When we sorted cells by the position that elicited the
most positive (least negative) response, we found a uniform mapping of
cue positions onto E-PG neurons (Fig. 1g). However, cue-evoked
hyperpolarization was more prominent for lateral cue positions (Extended
Data Fig. 2); this spatial bias is probably inherited from R neurons,
because the receptive fields of R neurons are similarly biased towards
lateral positions.

When we managed to record sequentially from two adjacent E-PG
neurons in the same brain, we found that they had adjacent receptive
fields, as expected (Extended Data Fig. 3). However, when we pooled
data across brains, we found no systematic relationship between the
location of the dendrites of the E-PG neuron and its receptive field
(Fig. 1h). Therefore, the mapping from visual space to compass
coordinates is different across individuals.

Fig. 3 | R neurons drive visually evoked inhibition in E-PG neurons. a, Left,
visually evoked spike rates in an R neuron (mean ± s.e.m. across trials, n = 5-6
trials, R2 neurons). Right, four responses to repeated presentation of the best
cue position for this neuron. We observed spatially tuned responses in 3 out of
7 R2 cells and 1 out of 3 R4d cells; an additional 3 R2 cells and 1 R4d cell
responded to full-field illumination but were unresponsive to the cue or not
spatially tuned. b, Left, responses of an E-PG neuron to optogenetic activation
of R2 neurons via Chrimson (ChR), with four single trials in grey, mean in black.
Middle, same as for the left panel, but with no Chrimson in R neurons, n = 4
trials. Right, summary of mean evoked hyperpolarization with Chrimson in R
neurons (squares, R2 neurons, n = 7; triangles, R4d neurons, n = 4) and controls
(n = 5). c, Left, E-PG visual receptive fields in flies in which R neurons were
chronically hyperpolarized using Kir2.1 expression driven by R54E12-Gal4 or
R20A02-Gal4 (green shades) versus controls (R54E12-Gal4 only, R20A02-Gal4
only, UAS-Kir2.1 only, grey shades). Right, summary of peak visually evoked
hyperpolarization, colour-coded as in the left panel (horizontal lines are
means; n = 9, 10, 12, 10, 8 cells, from left to right; R54E12 Kir2.1 versus R54E12/+
and UAS/+, P = 0.021 and P = 0.0016, respectively; R20A02 Kir2.1 versus
R20A02/+ and UAS/+, P = 0.0046 and P = 0.012, respectively; two-sided
Wilcoxon rank-sum tests).

Next, we investigated how the visual receptive field of an E-PG neuron
compares with its heading tuning. To measure heading tuning, we
allowed the fly to walk in closed-loop virtual reality (VR) in which the
horizontal position of the cue was locked to the virtual heading of the
fly (Fig. 2a, b). We periodically paused VR to map the visual receptive
field of the same neuron using brief random flashes. In most neurons,
we found that the visual receptive field was correlated with heading
tuning (Fig. 2c, d and Extended Data Figs. 3, 4). This result is notable
because heading tuning reflects not only synaptic inputs related to
visual cue position, but also synaptic inputs related to the rotational
velocity of a fly. Imperfect alignment between these inputs may explain
why some neurons showed poor correlations (Fig. 2d).
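
The comparison between visual receptive fields and heading tuning can be sketched as a Pearson correlation per cell, set against a null distribution built from randomized visual-heading pairings. This is an illustrative sketch rather than the analysis code used for Fig. 2: the function names, the number of iterations and the way pairings are randomized (permuting heading-tuning curves across cells) are assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shuffled_means(receptive_fields, heading_tunings, n_iter=1000, seed=0):
    """Mean correlation under randomized visual-heading pairings.

    Each iteration pairs every receptive field with the heading tuning
    of a randomly assigned cell (hypothetical shuffling scheme).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_iter):
        permuted = heading_tunings[:]
        rng.shuffle(permuted)
        rs = [pearson(rf, ht) for rf, ht in zip(receptive_fields, permuted)]
        means.append(sum(rs) / len(rs))
    return means
```

The observed mean correlation across cells can then be compared with the central 95% of the values returned by `shuffled_means`.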

To confirm that R neurons are the actual source of visual responses in
E-PG cells, we focused on two R neuron types (R2 and R4d) that respond
to sparse visual cues. First, we used whole-cell recordings to confirm
that these R neuron types can be excited by the visual cue (Fig. 3a).
Second, we verified that optogenetically activating either R2 or R4d
neurons inhibits E-PG neurons (Fig. 3b). Third, we established that R
neurons are required for normal visually evoked hyperpolarization in
E-PG neurons. We used two independent driver lines to hyperpolarize
R2 or R4d neurons by overexpressing the potassium channel Kir2.1
(Extended Data Fig. 5), and we confirmed that visually evoked
hyperpolarization was attenuated (Fig. 3c and Extended Data Fig. 6) in both
genotypes. A few E-PG neurons still showed some visual responses,
probably because neither driver line achieved complete coverage of
R2 and R4d neurons (Extended Data Figs. 5, 6).

Fig. 4 | Visuomotor experience can persistently change E-PG
ensemble representations of heading direction. a, E-PG ensemble GCaMP6f
signals. Here the circular E-PG ensemble has been linearized, with each row
showing eight sectors of the ensemble. The fly walked in a one-cue
environment (pre-training, =10 min), then a two-cue environment (training,
20 min), and finally a one-cue environment (post-training, >4 min). Three
snippets of one experiment are shown. Brackets mark 360° turns when the
bump skipped over half the ensemble. b, In the same experiment, VR heading
(red) overlaid with the decoded neural representation of heading (blue). We
double-plotted both traces and shifted the entire red trace horizontally so that
it overlapped with the blue trace during pre-training. c, The offset of the
decoded neural representation of heading relative to VR heading,
double-plotted. The circular mean during pre-training is marked with a vertical line
(defined as offset₀). d, Offset probability histograms during each block, for
seven example experiments. We found diverse values of offset₀ in different
flies, as reported previously, but for display we horizontally aligned all offset₀
values in different flies. The opposing range is defined as the range from
(offset₀ + 90°) to (offset₀ - 90°). e, Total offset probability in the opposing
range. Each set of connected points is one experiment (n = 19 flies). Training
and post-training are both significantly different from pre-training
(P=3.8 10° training versus pre-training and P= 5.31 x 10 © post-training versus
pre-training, two-sided exact paired Wilcoxon signed-rank tests).

Next, we turned to our second hypothesis: that changes in visual-heading
associations can trigger systematic, time-locked changes in
the visual receptive fields of E-PG neurons. After allowing the fly to
navigate in VR with one visual cue (the pre-training block), we switched
to VR with two cues positioned 180° apart (the training block). In the
training block, a full turn and a half-turn will arrive at an identical view of
the world, meaning the correlation between rotational velocity signals
and visual cue position signals will be altered.

Fig. 5 | Visuomotor experience can remap visual input to E-PG neurons
contingent on postsynaptic activity. a, After the fly navigated in VR with one
cue (pre-training), we measured the visual receptive field of an E-PG neuron
(first probe). Then the fly navigated in VR with two cues for 12 min (training)
and we again measured the visual receptive field (second probe). b, Five
example neurons. For each neuron, the red solid curve is heading tuning. Red
dashed curve is the change in heading tuning (training minus pre-training).
Blue solid curve is the visual receptive field. Blue dashed curve is the change in
the visual receptive field (second probe minus first probe). Arrowheads mark
large changes. Black vertical scale bar applies to all data in b. Thin black
horizontal lines indicate zero change (Δ = 0 mV). Neuron 5 is an example with
little modulation by heading during training and little change in visual
receptive field. c, Explanation of metrics in d-f. d, Absolute change in visual
receptive field, versus change in receptive field shape (R² = 0.44, P = 0.00078
testing t-statistic slope ≠ 0; 22 E-PG neurons in 22 flies). e, Absolute change in
visual receptive field post-training (22 E-PG neurons in 22 flies) versus controls
(17 E-PG neurons in 17 flies). Post-training flies are significantly different from
control flies (P = 0.043, two-sided Wilcoxon rank-sum test). Controls walked in
a one-cue VR (not two-cue VR) between the first and second probe. Four
training experiments had changes significantly larger than any controls (>2 s.d.
above control mean, vertical bar); these are neurons 1-4. f, Absolute change in
visual receptive field, versus modulation by heading during training (R² = 0.52,
P = 0.00016 testing t-statistic slope ≠ 0; 22 E-PG neurons in 22 flies).
g, Schematic of model. When a visual cue appears, it activates specific R
neurons (highlighted magenta cell) and this pushes the bump towards the E-PG
neuron with minimal inhibition (highlighted grey cell). Training changes
R-to-E-PG weights so that the bump toggles between two offsets during
post-training. R neurons are ordered by receptive field position, and E-PG neurons
are ordered by preferred heading direction (arrows).

To assess the effect of training on network dynamics, we imaged
calcium signals from the entire E-PG ensemble (Fig. 4a). During
pre-training, there was a stable offset between the visual environment
and the E-PG bump (Fig. 4b, c). During training, the offset toggled
between two values approximately 180° apart. This result is expected,
because there are two equally valid interpretations of the visual scene,
yet only one bump can exist in the E-PG ensemble. When the fly made
a 360° turn, we often saw the bump flow twice around 180° of the E-PG
ensemble, skipping over the other 180° (Fig. 4a-c). Rotational velocity
inputs to the E-PG network should drive the bump to traverse the full
circle during a full turn; the skipping-over phenomenon thus indicates
the dominance of visual position inputs over angular velocity inputs.
The E-PG neurons that were traversed twice essentially displayed two
preferred heading directions; this is reminiscent of the finding that
some rat head-direction cells show two preferred directions in an
environment with twofold rotational symmetry.

Upon returning to a one-cue environment (post-training), the offset 
sometimes immediately settled into its original value. Often, however, 
this was not the case. Rather, the offset continued to toggle for several 
minutes, or else it immediately settled in a new value rather than the 
original one (Fig. 4d, e). Both of the latter two outcomes suggest a 
persistent, systematic change in the way that visual cues are mapped 
onto E-PG neurons. We observed one of the latter outcomes in about 
half of our experiments (Fig. 4e and Extended Data Fig. 7). 

Finally, to investigate whether training changes visual receptive 
fields, we returned to E-PG whole-cell recordings (Fig. 5a). We began 
each experiment with one visual cue in VR (pre-training). We then 
switched to two visual cues in VR (training). Between each block of 
VR, we periodically paused to map the receptive field of the neuron 
using brief random flashes. Whereas we used a 360° panorama during 
calcium imaging, the spatial constraints of electrophysiology required 
us to map the 360° environment onto a 270° panorama’”. 

During the training block, we found that some E-PG neurons were 
strongly modulated by the fly’s heading. In these neurons, training 
produced changes in the visual receptive field (Fig. 5b, neurons 1-4). These 
changes were bidirectional, suggesting that visually evoked inhibition 
was depressed for some cue locations and potentiated for others. We 
quantified these changes by summing the absolute value of the change 
in the receptive field across all cue positions (absolute change; Fig. 5c). 
We also measured the change in the shape of the receptive field (Fig. 5c). 
These metrics were correlated across experiments (Fig. 5d); we never 
saw a large absolute change in the receptive field without a change in 
receptive field shape. We also never observed large changes in the receptive 
field under control conditions in which flies only experienced one 
cue in VR (but not two cues) during the period between the receptive 
field mapping epochs (Fig. 5e, Extended Data Figs. 8, 9). 
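
As a minimal sketch of the absolute-change metric described above (the function name and the example values are illustrative assumptions, not taken from the study), the change in the receptive field can be summed as an absolute value across cue positions. The receptive-field shape metric is not defined in this section, so it is not sketched here.

```python
import numpy as np

def absolute_change(rf_pre, rf_post):
    """Sum over cue positions of |post - pre| for one neuron.

    rf_pre, rf_post: visual responses (mV) at the same cue positions,
    measured before and after training. Hypothetical helper.
    """
    rf_pre = np.asarray(rf_pre, dtype=float)
    rf_post = np.asarray(rf_post, dtype=float)
    return float(np.sum(np.abs(rf_post - rf_pre)))

# Illustrative values only: inhibition weakens at one position and
# strengthens at another, so the signed changes cancel but the
# absolute change does not.
pre = [-2.0, -1.0, 0.0, -3.0]
post = [-1.0, -2.0, 0.0, -3.0]
print(absolute_change(pre, post))
```

Taking the absolute value before summing is what allows bidirectional changes (depression at some cue locations, potentiation at others) to register rather than cancel.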

By contrast, other E-PG neurons were essentially unmodulated by the 
fly’s heading during training (Fig. 5b, neuron 5). These neurons may reside 
in sectors of the ensemble that were skipped over by the bump during 
training. Notably, training had almost no effect on visual receptive fields 
in these neurons (Fig. 5b and Extended Data Fig. 8). Overall, the magnitude 
of heading modulation during training was significantly correlated with 
the subsequent change in the visual receptive field (Fig. 5f). This correlation 
indicates that remapping depends on the activity of the E-PG neuron. 
Simply exposing the fly to the altered visual environment is not sufficient; 
rather, visual cues must intersect with heading representations in E-PG 
neurons. Because R-to-E-PG synapses are the site of intersection between 
visual responses and heading representations, they are the most likely 
locus of plasticity. In a companion study, Kim et al.” used optogenetic 
manipulations to reach the same conclusion. Because R neuron dendrites 
form a retinotopic map that is fairly consistent across flies®, it seems unlikely 
that the visual map in R neuron dendrites is experience-dependent, further 
supporting the notion that R-to-E-PG synapses are the locus of plasticity. 


Discussion 


We propose that correlated pre- and postsynaptic activity triggers 
associative long-term synaptic depression of R-to-E-PG inhibition. 
This learning rule would explain why visual receptive fields and head- 
ing tuning are typically aligned in E-PG neurons. When an individual R 
neuron is activated by a visual cue, it should push the bump of activity 
towards the E-PG neurons that it inhibits most weakly (Fig. 5g). If the 
full ring attractor network agrees with this outcome, then long-term 
synaptic depression will occur and those weak R-to-E-PG synapses 
will become even weaker, further reinforcing this outcome. To ensure 
network stability, long-term synaptic depression should be balanced 
by long-term potentiation at R-to-E-PG synapses; the co-existence of 
long-term synaptic depression and long-term potentiation would also 
explain why we found bidirectional changes in visual receptive fields 
after training (Fig. 5b). These learning rules should produce a doubled 
pattern of R-to-E-PG synaptic weights after training in a two-cue world 
(Fig. 5g), reflecting the twofold symmetry of visuomotor correlations. 
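
The proposed rule can be illustrated with a toy update, offered only as a sketch under stated assumptions. The weight matrix, the learning rate, the outer-product form of the depression term and the multiplicative rescaling that stands in for balancing potentiation are all hypothetical choices made for illustration; the text specifies only that correlated pre- and postsynaptic activity depresses R-to-E-PG inhibition and that this should be balanced by potentiation.

```python
import numpy as np

def update_weights(w, pre, post, lr=0.1):
    """One step of a toy associative rule on inhibitory weights.

    w:    (n_R, n_EPG) non-negative inhibitory weights (hypothetical).
    pre:  (n_R,) R neuron activity; post: (n_EPG,) E-PG activity.
    Depression: weights shrink where pre and post are co-active.
    Potentiation (assumed form): each R neuron's outputs are rescaled
    so that its summed weight is conserved.
    """
    w = np.asarray(w, dtype=float)
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    total = w.sum(axis=1, keepdims=True)
    # Associative depression of co-active synapses, floored at zero.
    w_new = np.clip(w - lr * np.outer(pre, post), 0.0, None)
    new_total = w_new.sum(axis=1, keepdims=True)
    # Compensating potentiation: restore each row's original sum.
    scale = np.divide(total, new_total,
                      out=np.ones_like(total), where=new_total > 0)
    return w_new * scale

w0 = np.ones((2, 4))                 # two R neurons, four E-PG neurons
pre = np.array([1.0, 0.0])           # only R neuron 0 is active
post = np.array([0.0, 1.0, 0.0, 0.0])  # bump sits on E-PG neuron 1
w1 = update_weights(w0, pre, post, lr=0.2)
print(w1)
```

In this toy version the synapse between the active R neuron and the active E-PG neuron weakens while that R neuron's other synapses strengthen slightly, which gives bidirectional changes of the kind described above.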

The key result of this study, that visual inputs to E-PG neurons are 
plastic, supports theoretical models that describe how a network can 
progressively establish a spatial map of the world by incorporating 
information about consistent sensory cues during exploration’ ”. In 
robotics, this process is called simultaneous localization and map- 
ping”*. Our results provide direct experimental evidence for this type 
of unsupervised learning at the level of synaptic potentials in vivo. 

In a simultaneous localization and mapping framework, visual cues 
are often local, meaning that they can change in size and apparent angle 
as they are approached; by contrast, we chose to use visual cues that 
could not be approached, simplifying the relationship between head- 
ing and visual cues. This choice was motivated by the known receptive 
field properties of R2 or R4d neurons, which seem adapted to detect 
the position of the Sun (or Moon). Specifically, R2 or R4d neurons have 
large inhibitory surrounds, meaning that they only respond robustly to 
isolated visual objects’ such as the Sun. The Sun is an ideal compass 
cue because it is effectively at infinity”. 

We propose that plasticity at R-to-E-PG synapses allows the position 
of the Sun to be flexibly associated with other compass cues, such as 
the pattern of linearly polarized light in the sky”’, sky-wide chromatic 
and intensity gradients”’”’, and wind”. In other insects, the E-PG 
network responds to multiple sorts of compass cues”, and naviga- 
tion behaviour can depend on arbitrary learned associations between 
compass cues**®. In a companion study, Kim et al.” provide evidence 
in favour of the idea that plasticity could be used to learn a complex 
conjunction of visual objects; in the future, to test this idea, it will be 
interesting to see whether any complex scene can generate a progres- 
sively more-stable heading representation (offset) during training. It 
will also be important to extend the approach that we have taken here 
to simulate a more naturalistic virtual world, to study how multiple 
types of cues influence the behaviour of this network and the organism. 
Methods 


Fly husbandry and genotypes 

Unless otherwise stated, flies were raised on standard cornmeal-molasses 
food (New Brown 19L, Archon Scientific) in an incubator on a 12-h:12-h 
light:dark cycle at 25 °C with humidity between around 50 and 70%. 

All experiments with visual stimuli used flies with at least one wild- 
type copy of the white gene, and most electrophysiology experiments 
used flies with two copies of the wild-type white gene (as described 
below). 

The experimenter was not blind to genotype because we did not use 
genetic perturbations; the exception is Fig. 3c (Kir2.1 perturbation). 
For the dataset shown in Fig. 3c, the experimenter was blind to genotype 
after the pilot phase for driver line R20A02-Gal4; because Fig. 3c 
pilot data were indistinguishable from subsequent data, all data were 
ultimately pooled; overall the experimenter was blind to genotype in 
67% of these recordings. For the dataset obtained using the driver line 
R54E12-Gal4, the experimenter was not blind to genotype because the 
experimental genotype was obtained at a lower-than-expected (sub- 
Mendelian) frequency, making it impractical to blind the experimenter. 

Genotypes of fly stocks used in each figure are as follows. For 
Figs. 1, 2, 5 and Extended Data Figs. 1-4, 8, 9, we used 
P{20XUAS-IVS-mCD8::GFP}attP40/P{20XUAS-IVS-mCD8::GFP}attP40; 
P{R60D05-Gal4}attP2/P{R60D05-Gal4}attP2 flies. For Fig. 3a, we used 
P{20XUAS-IVS-mCD8::GFP}attP40/P{20XUAS-IVS-mCD8::GFP}attP40; 
P{GawB}EB1/+ flies. For Fig. 3b (R2 activation), we used w/+; 
P{R19C08-lexA}attP40/P{20XUAS-IVS-mCD8::GFP}attP40; 
PBac{13xLexAop2-IVS-Syn21-Chrimson::tdT-3.1}VK00005/P{R60D05-Gal4}attP2 flies. 
For Fig. 3b (R4d activation), we used w/+; 
P{R60D05-lexA}attP40/P{13XLexAop2-mCD8::GFP}attP40; 
P{20XUAS-CsChrimson-tdTomato}VK00005/P{R12B01-Gal4}attP2 flies. 
For Fig. 3b (no Chrimson), we used 
P{20XUAS-IVS-mCD8::GFP}attP40/P{20XUAS-IVS-mCD8::GFP}attP40; 
P{R60D05-Gal4}attP2/P{R60D05-Gal4}attP2 flies. For Fig. 3c 
and Extended Data Fig. 6 (Kir2.1 silencing, driver 1), we used +/w; 
P{R60D05-lexA}attP40/P{13XLexAop2-mCD8::GFP}attP40; 
P{R20A02-Gal4}attP2/P{UAS-Hsap\KCNJ2.eGFP} flies. For Fig. 3c and Extended Data 
Fig. 6 (Kir2.1 silencing, driver 2), we used +/w; 
P{R60D05-lexA}attP40/P{13XLexAop2-mCD8::GFP}attP40; 
P{R54E12-Gal4}attP2/P{UAS-Hsap\KCNJ2.eGFP} flies. For Fig. 3c and 
Extended Data Fig. 6 (UAS-only controls), we used +/w; 
P{R60D05-lexA}attP40/P{13XLexAop2-mCD8::GFP}attP40; 
+/P{UAS-Hsap\KCNJ2.eGFP}3 flies. For Fig. 3c and Extended 
Data Fig. 6 (Gal4-only controls), we used +/w; 
P{GMR60D05-lexA}attP40/P{13XLexAop2-mCD8::GFP}attP40; +/P{R20A02-Gal4}attP2 
flies and +/w; P{GMR60D05-lexA}attP40/P{13XLexAop2-mCD8::GFP}attP40; 
+/P{R54E12-Gal4}attP2 flies. For Fig. 4 and Extended Data 
Fig. 7, we used +/w; P{UAS-GCaMP6f}attP40/+; P{R60D05-Gal4}attP2/+ 
flies. For Extended Data Fig. 5, we used R57C10-FLPG5.PEST; 
UAS(FRT.stop)myr::smGdP-HA, UAS(FRT.stop)myr::smGdP-V5, 
UAS(FRT.stop)myr::smGdP-Flag/R20A02-Gal4, R57C10-FLPG5.PEST; 
UAS(FRT.stop)myr::smGdP-HA, UAS(FRT.stop)myr::smGdP-V5, 
UAS(FRT.stop)myr::smGdP-Flag/R54E12-Gal4 flies. 


Origins of transgenic stocks 

The following GMR Gal4 lines were obtained from the Bloomington 
Drosophila Stock Center (BDSC) and were described previously”: 
P{R60D05-Gal4}attP2, P{R60D05-lexA}attP40, P{R19C08-lexA}attP40, 
P{R12B01-Gal4}attP2, P{R54E12-Gal4}attP2, P{R20A02-Gal4}attP2. The 
P{GawB}EB1 line was also obtained from the BDSC and was described 
previously’®. 

P{20XUAS-IVS-mCD8::GFP}attP40 was a gift from B. Pfeiffer and 
G. Rubin and was described previously”. P{13XLexAop2-mCD8::GFP}attP40 
was obtained from the BDSC and was described previously”. 
PBac{13xLexAop2-IVS-Syn21-Chrimson::tdT-3.1}VK00005 was a gift from 
B. Pfeiffer and D. Anderson and was described previously*°. 
P{20XUAS-CsChrimson-tdTomato}VK00005 was a gift from J. Tuthill who obtained 
it from B. Pfeiffer. (Note that we have confirmed that this CsChrimson 
insert is on the third chromosome, but it may not be in VK00005, given 
the recombination frequencies observed in our laboratory. We have 
confirmed that this insertion does generate tdTomato expression 
and light-evoked currents in Gal4⁺ cells.) P{UAS-Hsap\KCNJ2.eGFP}7 
was obtained from the BDSC and was described previously“. 
P{UAS-GCaMP6f}attP40 was obtained from the BDSC through T. Clandinin 
and was described previously”. 

Transgenes for MultiColor FlpOut were obtained from the 
BDSC and were described previously”, including w[1118] P{y[+t7.7] 
w[+mC] = GMR57C10-FLPG5.PEST}su(Hw)attP8; PBac{y[+mDint2] and 
w[+mC] = 10xUAS(FRT.stop)myr::smGdP-HA}VK00005 P{y[+t7.7] and 
w[+mC] = 10xUAS(FRT.stop)myr::smGdP-V5-THS-10xUAS(FRT.stop) 
myr::smGdP-Flag}su(Hw)attP1. 


Fly preparation and dissection 

Newly eclosed virgin female flies were anaesthetized on ice (electrophysiology) 
or CO₂ (imaging) and were collected around 3-10 h (electrophysiology) 
or 12-26 h (imaging) before the experiment. In some 
cases, to promote walking behaviour, we deprived the flies of food 
(but not water) for approximately 3-10 h before the experiment, and 
experiments were performed around the subjective evening of the fly 
(±2 h from light to dark switch, Zeitgeber time 12); this was done in Fig. 5 
and in 72% of recordings in Fig. 2. In all other experiments, there was no 
circadian restriction and flies were kept on food until the dissection. 
At the beginning of each dissection, the fly was cold-anaesthetized. 

For electrophysiology experiments, the preparation holder con- 
sisted of flat titanium foil secured to an acrylic platform, with the foil 
oriented parallel to the horizontal body plane; the fly’s head and body 
were gently pushed partway through a hole in the foil. For E-PG neuron 
electrophysiology, the head was pitched forward so that the posterior 
surface was roughly parallel to the foil and most of each eye was under 
the foil. For R neuron electrophysiology, the head was positioned in a 
more upright angle, and a 90° bend was made in the foil to maximize 
the area of the eyes that was under the foil. For imaging experiments, 
the preparation holder was shaped like an inverted pyramid and was 
CNC machined from black acrylic (Autotiv), and the head was pitched 
forward so that the posterior surface was oriented dorsally and most of 
the eye was under the holder. The fly was always secured in the holder 
with epoxy (Loctite AA 3972) and cured using a brief (<1-s) pulse of 
ultraviolet light (LED-200, Electro-Lite Co). Wings were sometimes 
repositioned or removed. After the dorsal head was covered in saline, 
a hole was cut in the head capsule and some trachea were removed to 
expose the brain area of interest. To reduce brain movement, muscle 16 
was removed, the proboscis was removed (Figs. 1-3, 5) or glued (Fig. 4) 
and the oesophagus was clipped or removed (Fig. 4). For electrophysiol- 
ogy, an aperture was made in the perineural sheath around the somata 
of interest either by ripping gently with fine forceps or by using suction 
from a patch pipette filled with external solution. 

The external solution contained (in mM): 103 NaCl, 3 KCl, 5 
N-tris(hydroxymethyl) methyl-2-aminoethane-sulfonic acid, 8 trehalose, 
10 glucose, 26 NaHCO₃, 1 NaH₂PO₄, 1.5 CaCl₂ and 4 MgCl₂, with 
osmolarity adjusted to 270-273 mOsm. External solution was bubbled 
with 95% O₂ and 5% CO₂ and reached a final pH of 7.3. External solution 
was continuously perfused over the brain during electrophysiology 
and before imaging. 


Patch-clamp recordings 

Patch pipettes were made from borosilicate glass (Sutter, 1.5 mm 
o.d., 0.86 mm i.d.) using a Sutter P-97 puller. For E-PG recordings, the 
pipette was fire-polished after pulling** using a microforge (ALA 
Scientific Instruments) to achieve a final resistance of 8-15 MΩ. For 
R neuron recordings, pipettes (4-10 MΩ) were not fire-polished. 
The internal solution contained (in mM): 140 potassium aspartate, 
10 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid, 4 MgATP, 
0.5 Na₃GTP, 1 ethylene glycol tetraacetic acid, 1 KCl and 13 biocytin 
hydrazide. The pH was 7.3 and the osmolarity was adjusted to approximately 
268 mOsm. To encourage walking, the external solution was 
heated before the experiment* to around 25-32 °C; this was done for 
all recordings in Fig. 5, 65% of recordings in Fig. 2 and 42% of recordings 
in Fig. 1. All other recordings were performed using external solution 
at room temperature. 

To obtain patch-clamp recordings under visual control, we used an 
Olympus BX51WI microscope with a 40x water-immersion objective. 
Neurons were identified as GFP⁺ using a Hg-lamp source (U-LH100HG, 
Olympus) with an eGFP long-pass filter (U-N41012, Chroma). For experiments 
in which the fly was positioned on a foam ball, far-red light was 
delivered from a fibre-coupled LED (740 nm, M740F2, Thorlabs) 
through a ferrule patch cable (200 µm core, Thorlabs) plugged into a 
fibre-optic cannula (1.25 mm SS ferrule 200 µm core, 0.22 NA, Thorlabs) 
glued to the recording platform, with the tip of the cannula around 
1cm behind the fly. In experiments without the ball, the brain was illu- 
minated with 780 nm light via the microscope condenser, and after 
the recording was obtained, the condenser was lowered to prevent it 
from obscuring the fly’s view of the visual panorama. 

Recordings were obtained using an Axopatch 200B amplifier and a 
CV-203BU headstage (Molecular Devices). Voltage signals were low- 
pass filtered at 5 kHz before digitalization and then acquired with a 
NiDAQ PCI-6251 (National Instruments) at 20 kHz. Liquid junction- 
potential correction was performed post hoc by subtracting 13 mV 
from recorded voltages**. 


Two-photon calcium imaging 

Imaging experiments were performed using a two-photon microscope 
with a moveable stage (Thorlabs Bergamo II) and a fast piezoelectric 
objective scanner (Physik Instrument P725) for volumetric imaging. 
For two-photon excitation, we used a Chameleon Vision-S Ti-Sapphire 
femtosecond laser tuned to 940 nm. Images were collected using a 
20x/1.0 NA objective (Olympus). Emission fluorescence was filtered 
with a 525-nm bandpass filter (Thorlabs) and collected using a GaAsP 
photomultiplier tube (Hamamatsu). 

The imaging region was centred on the protocerebral bridge, which 
is the brain region in which E-PG neuron axons terminate. For E-PG 
neurons, there is an orderly and stereotyped mapping from the loca- 
tion of the dendrite of the cell to the location of its axon terminal in 
the protocerebral bridge*’. Following previous studies**, we chose to 
image E-PG axons rather than dendrites because the axons are more 
superficial, and so more optically accessible. The imaging region was 
256 x 128 pixels, and 8-12 slices deep in the z axis (3-5 µm per slice), 
resulting in a 6-9 Hz volumetric scanning rate. 

Volumetric z-scanning signals from the piezoelectric objective scan- 
ner were acquired simultaneously with analogue output signals from 
the visual panorama and analogue outputs from FicTrac via a NiDAQ 
PCI-6341 at 4 kHz. Two-photon calcium imaging data were acquired 
using ScanImage 2018 (Vidrio Technologies) with National Instruments 
hardware provided by Vidrio (NI PXIe-6341). 


Measurement of locomotion 

In all experiments except those shown in Fig. 3a, b, the fly stood on a 
9-mm ball made of white foam (FR-4615, General Plastics) painted with 
black shapes. The ball floated above a plenum made of opaque ABS- 
like plastic (Figs. 1, 2, 3c, 5) or optically clear acrylic (Fig. 4) and was 
3D-printed by Autotiv. Air was flowed into the plenum at the base and 
flowed out at the top in the semi-spherical depression that cradled the 
ball. The ball was illuminated by either an infrared LED (780 nm M780L3, 
Thorlabs) with a ground glass diffuser (DG10-220-MD, Thorlabs) (Figs. 1, 
2, 3c, 5) or a round board 36 infrared LED lamp (SODIAL) (Fig. 4). The 
movement of the ball was tracked at approximately 60-70 Hz using 
a video camera (Firefly MV FMVU-03MTM, Point Grey) fitted with a 
Computar Macro zoom 0.3-1x, 1:4.5 lens (Figs. 1, 2, 3c, 5) or a Tamron 
23FM08L 8-mm 1:1.4 lens (Fig. 4). In experiments in which we used 
a 360° visual panorama (Fig. 4), the image of the ball was reflected 
to the camera using a mirror (Thorlabs broadband dielectric mirror, 
750-1,100 nm, BB1-E03) positioned below the ball. Machine vision 
software (FicTrac) converted the image of the ball to an estimate of the 
position of the ball in all three axes of rotation*’. FicTrac was modified 
to send real-time analogue measurements of all three motion axes of 
the ball to a USB DAQ (USB-3101, Measurement Computing). For closed- 
loop experiments, the yaw-position voltage signal was used to update 
the azimuthal position of the visual cues displayed on the panorama. 


Visual panorama 

Visual stimuli were presented using a circular panorama (IORodeo) 
composed of modular square panels*’. Each square panel was an 8 x 8 
array of LEDs (8 x 8 ‘pixels’) that refreshed at 372 Hz or faster”. In elec- 
trophysiology experiments, these LEDs were green (peak = 525 nm). 
In imaging experiments, these LEDs were blue (peak = 470 nm) to 
minimize overlap with GCaMP6f emission. The vertical edge of the 
panorama was positioned approximately aligned with the vertical 
location of the fly. A single pixel along the top of the arena subtended 
around 3.6-3.7° of the visual field of the fly; this range of 0.1° is due to 
the fact that individual pixels within each flat 8 x 8 array have slightly 
different distances from the fly’s eye. A single pixel at the bottom of 
the arena subtended around 2.7°. These differences in pixel size were 
not compensated for in our experiments. 

In Figs. 1, 2, 3c, 5, we used a panorama composed of 9 x 2 panels. It 
spanned 270° azimuth and was oriented slightly asymmetrically so that 
it covered the azimuthal range from 127° left of the midline to 143° right 
of the midline. In Fig. 3a, we used a panorama composed of 6 x 2 panels 
that spanned 180° azimuth. In Fig. 4, we used a panorama composed 
of 12 x 2 panels that spanned 360° azimuth. All visual panoramas were 
the same height and spanned approximately 43° vertically within the 
visual field of the fly. 

In electrophysiological experiments, to reduce electrical noise, 
the panorama was wrapped with a grounded copper mesh that was 
coloured with a black marker to reduce reflections. To further reduce 
reflections, the front surface of each panel was covered with a diffuser 
(SXF-0600 Snow White Light Diffuser, Decorative Films). In imaging 
experiments, instead of diffuser film, we used tracing paper as a dif- 
fuser, and four layers of filters (Rosco, R381, bandpass centre 440, 
full-width at half maximum of 40 nm) were used to minimize detection 
of the visual stimulus by the GCaMP6f emission collection channel. 


Open- and closed-loop modes of visual stimuli 

To map visual receptive fields, we used a bright vertical bar (2 pixels 
wide, 7°) that spanned the full height of the panorama (around 43°). 
The bar was flashed for 500 ms followed by 500 ms of darkness. During 
open-loop mode, the display updated at 50 Hz. The bar was presented 
in a pseudorandom order at 35 different evenly spaced azimuthal positions 
across the screen (-120° to 135°). During each open-loop epoch, 
each bar position was used 4-5 times in total. For R neuron recordings, 
fewer positions were used (27 positions, -139° to 56°) and each location 
was used 5-6 times in total. 
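
The open-loop stimulus schedule can be sketched as follows. This is an illustration, not the authors' stimulus code: the function name, the seed and the block-shuffling scheme (one random permutation of all positions per block) are assumptions, since the text says only that the order was pseudorandom. The 150-s epoch length is taken from the description of the epoch structure given later in the Methods.

```python
import numpy as np

def open_loop_sequence(n_positions=35, az_min=-120.0, az_max=135.0,
                       epoch_s=150.0, flash_s=0.5, dark_s=0.5, seed=0):
    """Return the azimuth (deg) of each flash in one open-loop epoch.

    Assumed scheme: positions are evenly spaced between az_min and
    az_max, and are shuffled in blocks so that every position appears
    once per block; the last block is truncated at the epoch end.
    """
    positions = np.linspace(az_min, az_max, n_positions)
    n_flashes = int(epoch_s // (flash_s + dark_s))  # one flash per 1-s cycle
    n_blocks = -(-n_flashes // n_positions)         # ceiling division
    rng = np.random.default_rng(seed)
    order = np.concatenate(
        [rng.permutation(n_positions) for _ in range(n_blocks)])
    return positions[order[:n_flashes]]

seq = open_loop_sequence()
values, counts = np.unique(seq, return_counts=True)
print(len(seq), np.diff(values)[0], counts.min(), counts.max())
```

Under these assumptions, 35 positions spanning -120° to 135° are 7.5° apart, and a 150-s epoch of 1-s flash cycles gives 150 flashes, so each position is shown 4 or 5 times. The same arithmetic with 27 positions gives 5 or 6 presentations, in line with the counts stated above.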

To map heading tuning curves and to provide visuomotor training 
(closed-loop mode), we used a visual panorama containing either one 
vertical bar (one-cue) or two bars positioned on opposite sides of the 
virtual world (two-cue). Each vertical bar was identical to the bar we 
presented in open-loop mode. In closed-loop mode, we controlled the 
azimuthal position of the visual pattern using the yaw-position voltage 
output from FicTrac. Between consecutive closed-loop epochs were 
3-40 s of darkness, after which we shifted the pattern randomly (Fig. 4) 
or by a variable 45° or 90° increment (Figs. 2, 5) before returning to 
closed loop. Analogue output signals from the visual panel system 
and from FicTrac were digitalized with a NiDAQ PCI-6251 (National 
Instruments) at 20 kHz (electrophysiology) or with a NiDAQ PCI-6341 
(National Instruments) at 4 kHz (calcium imaging). In Fig. 4, the 360° 
yaw output signal was mapped directly to the 360° visual panorama. 
In Figs. 2, 5, we needed to use a 270° panorama owing to the space 
constraints imposed by the electrophysiology set-up; therefore, the 
360° yaw output signal was mapped linearly to the 270° panorama 
so that objects did not disappear when they reached one edge of the 
panorama but instead moved immediately across the gap’. Therefore, 
for example, whenever the fly made a 20° fictive right turn, the visual 
pattern would move 15° left. The exception to this is whenever the bar 
passed through the 90° gap; in that case, the bar traversed the gap 
immediately, as if the gap did not exist. How often this jump occurred 
varied from fly to fly depending on walking speed. We estimate that 
our most active flies experienced these 90° jumps of the cue around 
10 times per minute during a typical one-cue closed-loop trial. Note 
that in the 270° panorama, the two-cue pattern contained two bars 
spaced 135° apart. 
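
The linear compression of the 360° virtual world onto the 270° panorama can be sketched as below. The function name and the angle convention (virtual angle measured from the left edge of the panorama, negative azimuth meaning left of the midline) are assumptions made for illustration; the left-edge value of 127° left of the midline comes from the panorama description above.

```python
def panorama_azimuth(virtual_deg, left_edge=-127.0, span=270.0):
    """Map a cue's virtual-world angle (deg) to its panorama azimuth (deg).

    Assumed convention: virtual angle 0 sits at the left edge of the
    panorama (127 deg left of the midline) and increases rightwards.
    The 360-deg virtual circle is compressed linearly onto the
    270-deg display, so a cue leaving one edge reappears at the other.
    """
    fraction = (virtual_deg % 360.0) / 360.0
    return left_edge + fraction * span

# A 20-deg fictive right turn shifts the cue's virtual angle by 20 deg
# to the left, which corresponds to 15 deg on the panorama.
print(panorama_azimuth(100.0) - panorama_azimuth(80.0))
# Two cues 180 deg apart in the virtual world appear 135 deg apart.
print(panorama_azimuth(210.0) - panorama_azimuth(30.0))
```

The compression factor is 270/360 = 0.75, which accounts for both the 15° displacement per 20° turn and the 135° spacing of the two-cue pattern.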

In pilot electrophysiology recordings, during closed-loop epochs, 
the 360° yaw output signal was mapped to 360° of visual space (rather 
than 270°). This meant that the visual cue was only displayed when it 
resided on the 270° panorama, and the cue simply disappeared when it 
moved into the 90° sector in which the panels were missing. The head- 
ing tuning data from these 16 recordings were not included in the final 
dataset, but some open-loop visual responses from these neurons are 
included in Fig. 1g. We did not observe any systematic differences in 
the open-loop visual responses of these neurons from pilot recordings. 


Optogenetic stimulation 

Chrimson°°-expressing flies were raised on cornmeal-agar medium sup- 
plemented with rehydrated potato flakes (Carolina Biological Supply) 
mixed with 100 µl of all-trans-retinal stock solution (Sigma; 17 mM in 
ethanol). Fly vials were wrapped in foil to prevent photo-conversion of 
the all-trans-retinal. Controls in Fig. 3b were raised on molasses food 
without all-trans-retinal. For optogenetic stimulation, we used the 
Hg-lamp source (U-LH100HG) to deliver a 5-ms pulse of green light 
(530-550 nm, 2-4 mW, TRITC-Cy3 filter cube, Chroma) through the 
objective. A shutter (Uniblitz Electronic) controlled the pulse duration. 


Experimental epoch structure 

Each open-loop epoch always lasted 150 s and consisted of a sequence 
of random cue flashes. Each closed-loop epoch lasted 4 min (Figs. 2, 5) 
or 2 min (Fig. 4), during which time the visual pattern was continuously 
present and rotated in proportion to the fly’s fictive yaw velocity. In 
Fig. 1, open-loop epochs were usually interleaved with 4-min one-cue 
closed-loop epochs, although occasionally two open-loop epochs were 
delivered consecutively. In Fig. 2, at least one 4-min one-cue closed- 
loop epoch was presented before obtaining a recording, and after the 
recording was obtained, open-loop epochs and 4-min one-cue closed- 
loop epochs were interleaved. In Fig. 3c, only open-loop epochs were 
presented. In Fig. 4, for pre-training, we presented at least five 2-min 
one-cue closed-loop epochs. For training, we presented ten 2-min two-cue 
closed-loop epochs. For post-training, we presented at least two 
2-min one-cue closed-loop epochs. In Fig. 5, 1-6 epochs of one-cue 
closed-loop experience were presented before obtaining an E-PG neuron 
recording. Once the recording was obtained, the epoch structure 
was as follows. First, for pre-training, we cycled through 4-min one-cue 
closed-loop epochs alternating with open-loop epochs, for a total of 
2-6 cycles. For training, we presented three consecutive 4-min two-cue 
closed-loop epochs (experimental condition) or three consecutive 
4-min one-cue closed-loop epochs (matched control condition). For 
post-training, we presented one open-loop epoch. This protocol was 
followed in all training experiments in Fig. 5, with two exceptions. In 
one case, pre-training consisted of an open-loop epoch, followed by a 
closed-loop epoch, followed by another open-loop epoch (that is, 1.5 
cycles through the normal pre-training procedure). In the other case, 
during the closed-loop epochs before obtaining the recording, the fly 
experienced a different visual pattern that consisted of sparse randomly 
distributed single pixels (a ‘star field’ pattern), and this fly also received 
two consecutive open-loop epochs (instead of one) during pre-training. 


Immunohistochemistry 

MultiColor FlpOut. In Extended Data Fig. 5, MultiColor FlpOut (MCFO) 
was used to identify the morphological types of R neurons labelled by 
R20A02-Gal4. MCFO immunostaining was performed essentially as 
described previously*’. Primary incubation solution contained mouse 
anti-Bruchpilot (1:30, Developmental Studies Hybridoma Bank, nc82), 
rat anti-Flag (1:200, Novus Biologicals), rabbit anti-haemagglutinin 
(HA; 1:300, Cell Signaling Technologies) antibodies and 5% normal 
goat serum (NGS) in PBST. Secondary incubation solution contained 
Alexa Fluor 488-conjugated goat anti-rabbit (1:250, Invitrogen), 
ATTO 647-conjugated goat anti-rat (1:400, Rockland) and Alexa Fluor 
405-conjugated goat anti-mouse (1:500, Invitrogen) antibodies and 5% 
NGS in PBST. Tertiary incubation solution contained DyLight 550-con- 
jugated mouse anti-V5 (1:500, Bio-Rad) antibody and 5% normal mouse 
serum in PBST. 


Visualization of biocytin-filled neurons. Brains containing biocytin- 
filled neurons were processed after electrophysiological recording 
using standard procedures. Primary incubation solution contained 
mouse anti-Bruchpilot antibody (1:30, Developmental Studies Hy- 
bridoma Bank, nc82), chicken anti-GFP antibody (1:1,000, Abcam), 
Alexa Fluor 568-conjugated streptavidin (1:1,000, Invitrogen) and 5% 
NGS in PBST. Secondary incubation solution contained Alexa Fluor 
488-conjugated goat anti-chicken antibody (1:250, Invitrogen), Alexa 
Fluor 633-conjugated goat anti-mouse antibody (1:250, Invitrogen), 
Alexa Fluor 568-conjugated streptavidin (1:1,000, Invitrogen) and 5% 
NGS in PBST. 


Confocal microscopy and image analysis. Brains processed for MCFO 
were imaged using an Olympus FV1000 confocal microscope. Series 
of between 50 and 100 optical sections (1.0-µm spacing) were imaged 
using either a UPLFLN 40x/1.3 NA oil-immersion lens or a PLAPON 
60x/1.42 NA oil-immersion lens. R neuron MCFO clones were classified 
into 11 subtypes according to previously published methods* based on 
the consensus of two experts. Maximum intensity z-projections were 
rendered and adjusted using cropping and thresholding tools in Fiji 
(ImageJ) and assembled into figures using Illustrator (Adobe). 

Confocal microscopy of brains processed for biocytin fills, or to 
assess expression of Kir2.1::eGFP within R neurons (Fig. 3), was per- 
formed using a Leica SP8 or Leica SPE equipped with a 40x/1.3 NA oil- 
immersion lens. Cell body counting of eGFP-labelled R neurons was 
performed independently by two experts using the Fiji Cell Counter 
plugin®, and the mean count for each brain hemisphere is reported 
(Extended Data Fig. 6). 


Data analysis 

Visual receptive fields of E-PG neurons. In Figs. 1g, 2c, 3c, and 5b 
(and Extended Data Figs. 1-4, 8, 9), visual responses were calculated 
by taking the mean voltage during the final 250 ms of the 500-ms cue 
flash, and subtracting the mean voltage during the 250 ms preceding 
the flash, averaged over all presentations of the cue at each position. 
For display, visual receptive field curves were often smoothed using 
a median filter with a width of three cue positions (Figs. 1g, 3c and Ex- 
tended Data Fig. 2) or two cue positions (Figs. 2c, 5b and Extended Data 
Figs. 3, 4, 8, 9). Peak visually evoked hyperpolarization (Fig. 3c) and 
mean visually evoked hyperpolarization (Extended Data Fig. 6) were 
calculated on the median-filtered tuning curves. 
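
The visual response calculation can be sketched as follows, assuming a single voltage trace with known flash onset samples. The function names, the argument layout and the edge handling of the three-point median filter (edge values repeated) are assumptions made for illustration rather than details from the study.

```python
import numpy as np

def visual_receptive_field(voltage, fs, flash_onsets, flash_positions,
                           n_positions, flash_s=0.5, win_s=0.25):
    """Mean visual response (mV) at each cue position.

    Response per flash = mean voltage in the final win_s of the flash
    minus mean voltage in the win_s preceding flash onset. Responses
    are then averaged over all flashes at the same position index.
    Assumes every position was presented at least once.
    """
    voltage = np.asarray(voltage, dtype=float)
    win = int(round(win_s * fs))
    flash = int(round(flash_s * fs))
    sums = np.zeros(n_positions)
    counts = np.zeros(n_positions)
    for onset, pos in zip(flash_onsets, flash_positions):
        baseline = voltage[onset - win:onset].mean()
        late = voltage[onset + flash - win:onset + flash].mean()
        sums[pos] += late - baseline
        counts[pos] += 1
    return sums / counts

def median_smooth3(curve):
    """Median filter, width three cue positions (edges repeated; assumed)."""
    curve = np.asarray(curve, dtype=float)
    padded = np.pad(curve, 1, mode="edge")
    windows = np.stack([padded[:-2], padded[1:-1], padded[2:]])
    return np.median(windows, axis=0)

# Synthetic trace: a 5-mV hyperpolarization during the first flash only.
fs = 1000
v = np.full(5000, -60.0)
v[1000:1500] = -65.0
rf = visual_receptive_field(v, fs, [1000, 3000], [0, 1], n_positions=2)
print(rf)
```

Subtracting the pre-flash baseline flash by flash removes slow drifts in membrane potential before the responses are averaged at each position.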


Heading tuning of E-PG neurons. In Figs. 2, 5 (and Extended Data 
Figs. 3, 4, 8, 9), heading tuning curves were calculated by first binning 
heading into 35 bins centred on the visual cue positions. The voltage 
trace was filtered using a median filter with a width of 40 ms to remove 
spikes, and the mean-filtered voltage was measured for each heading 
during an epoch. For heading tuning curves calculated from multiple 
epochs, the voltage measurement for each heading bin was weighted 
relative to the number of samples in each individual epoch and the mean 
was then taken across epochs. For display, heading tuning curves were 
often smoothed using a median filter with a width of two cue positions 
(Figs. 2c, 5b and Extended Data Figs. 3, 4, 8, 9). 
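
A minimal sketch of the heading tuning calculation is given below. It assumes that the voltage traces have already been median-filtered to remove spikes, that bin centres are sorted, and that samples are assigned to the nearest centre; the function name and these choices are illustrative assumptions. Pooling sample sums and sample counts across epochs is equivalent to averaging the per-epoch bin means weighted by the number of samples in each epoch.

```python
import numpy as np

def heading_tuning(voltage_epochs, heading_epochs, bin_centres):
    """Mean voltage per heading bin, pooled across epochs.

    voltage_epochs, heading_epochs: lists of equal-length 1-D arrays,
    one pair per closed-loop epoch (voltage assumed spike-filtered).
    bin_centres: sorted heading bin centres (deg).
    Bins that were never visited return NaN.
    """
    centres = np.asarray(bin_centres, dtype=float)
    edges = (centres[:-1] + centres[1:]) / 2.0   # midpoints between centres
    sums = np.zeros(centres.size)
    counts = np.zeros(centres.size)
    for v, h in zip(voltage_epochs, heading_epochs):
        idx = np.digitize(np.asarray(h, dtype=float), edges)
        sums += np.bincount(idx, weights=np.asarray(v, dtype=float),
                            minlength=centres.size)
        counts += np.bincount(idx, minlength=centres.size)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts

# Two short synthetic epochs with three heading bins.
tuning = heading_tuning(
    voltage_epochs=[[1.0, 3.0, 5.0], [7.0, 9.0]],
    heading_epochs=[[-10.0, -9.0, 0.0], [0.0, 10.0]],
    bin_centres=[-10.0, 0.0, 10.0],
)
print(tuning)
```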


Yaw during open-loop epochs. In Fig. 1e, the FicTrac yaw-position 
signal was unwrapped, converted into radians, low-pass filtered (Butterworth) 
at 25 Hz and differentiated to obtain angular velocity. On rare 
occasions, a value of more than 2,500° s⁻¹ occurred in an isolated time 
sample; this was probably due to the imperfect nature of the unwrapping- 
and-differentiation procedure. These values were replaced with the 
value of the preceding sample. In Fig. 1f, the time-averaged yaw velocity 
was calculated by taking the mean yaw position during the final 250 ms 
of the flash and subtracting the mean yaw position during the 250 ms 
directly preceding the flash, and then dividing by the elapsed time 
(500 ms). We averaged data from left and right versions of the same 
cue displacement (because it seemed unlikely that alarge group of flies 
would show a systematic bias in the right or left direction) to obtain 
mean yaw velocity responses to a total of 16 cue positions for each of 73 
flies, thus obtaining 73 x 16 data points. We took the mean across flies 
at each cue position and plotted this as the black line in Fig. 1f. Next, 
to model the null case (in which visual cue position has no effect), we 
randomly drew 73 values (with replacement) from the matrix, without 
regard for cue position or fly identity, and we calculated the mean of 
these 73 values; we constructed a bootstrap distribution by repeating 
this procedure 10,000,000 times, each time calculating the mean of 
73 randomly drawn values. This bootstrap distribution was used to 
obtain a 95% confidence interval, which was then adjusted for multiple 
comparisons using a Bonferroni correction (m=16 tests). None of the 
true mean values (black) were outside this adjusted confidence interval 
(magenta lines). We used the same procedure in Extended Data Fig. le, 
except that the independent variable was the distance of the cuejump 
rather than the position of the cue. Finally, as a further control, we also 
examined whether any individual flies had a significant yaw velocity 
response to any cue position (Extended Data Fig. 1d). Because individual 
flies might be right- or left-handed™, we did not average data from 
right and left cue positions in this analysis; thus there were 35 cue posi- 
tions. For each fly, we computed trial-averaged yaw velocity for each 
of 2-8 open-loop epochs, and we created a matrix containing all cue 
positions for every epoch in the dataset of that fly. We then randomly 
drewanumber of values (with replacement) from the matrix (number 
of epochs x 35 cue positions) to match the number of epochs that we 
recorded for that fly. This procedure was randomized with respect to 
cue position and epoch number. For each fly, a bootstrap distribution 
was obtained by repeating this procedure 100,000 times, each time 
calculating the mean of the drawn values. The difference between 
the observed trial-averaged yaw responses for each cue position and 
the mean of the bootstrap distribution was used to obtain a P value 
(two-sided). In this manner, a P value was calculated for every fly at 
every cue position (73 x 35 Pvalues). The statistical significance of each 
trial-averaged yaw was assessed for each fly and each position at with 
a= 0.05 using the Bonferroni-Holm method to correct for multiple 
comparisons. No tests showed a statistically significant yaw velocity 
for any individual fly at any cue position. 
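
The two resampling controls can be illustrated with a short sketch. It is not the authors' implementation: the percentile rule used for the interval, the default number of resamples (far fewer than in the paper, to keep the example fast) and the function names are assumptions.

```python
import random

def bootstrap_null_ci(matrix, n_draw, n_boot=10_000, alpha=0.05, m_tests=16, seed=0):
    """Bonferroni-adjusted percentile interval for the mean of `n_draw` values
    drawn with replacement from the pooled matrix (cue position is ignored)."""
    rng = random.Random(seed)
    pooled = [x for row in matrix for x in row]
    means = sorted(sum(rng.choices(pooled, k=n_draw)) / n_draw for _ in range(n_boot))
    tail = (alpha / m_tests) / 2          # two-sided, Bonferroni-adjusted tail
    lo = means[int(tail * (n_boot - 1))]
    hi = means[round((1 - tail) * (n_boot - 1))]
    return lo, hi

def holm_reject(p_values, alpha=0.05):
    """Bonferroni-Holm step-down procedure; returns one reject flag per P value."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    reject = [False] * len(p_values)
    for rank, i in enumerate(order):
        if p_values[i] > alpha / (len(p_values) - rank):
            break                         # every larger P value also fails
        reject[i] = True
    return reject
```

An observed mean at a given cue position would be judged against the interval returned by `bootstrap_null_ci`, while `holm_reject` would be applied to the per-fly, per-position P values.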


Correlations between visual receptive fields and heading tuning
curves in E-PG neurons. In Fig. 2c, d, heading tuning curves and visual
receptive fields were smoothed using a median filter with a width of two
cue positions. Correlation coefficients were computed on smoothed
curves (40 pairs in total). In Fig. 2d, as a control, we randomly drew
(with replacement) 40 heading tuning curves and 40 visual response
curves, yielding 40 correlation coefficients. The mean of these
correlation values was then recorded. This process was repeated 10,000,000
times to build a bootstrap distribution and the 95% confidence interval
of this distribution was computed.
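
A minimal sketch of this shuffled-pair control is given below, assuming Pearson correlation at zero lag and a simple percentile interval. The function names and the reduced default number of resamples are choices made for the example, not details taken from the paper.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length curves."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def shuffled_correlation_ci(tuning_curves, receptive_fields, n_boot=10_000, seed=0):
    """95% interval of the mean correlation between randomly paired curves."""
    rng = random.Random(seed)
    n = len(tuning_curves)
    means = []
    for _ in range(n_boot):
        # Draw n tuning curves and n receptive fields independently, with replacement.
        rs = [pearson(rng.choice(tuning_curves), rng.choice(receptive_fields))
              for _ in range(n)]
        means.append(sum(rs) / n)
    means.sort()
    return means[int(0.025 * (n_boot - 1))], means[round(0.975 * (n_boot - 1))]
```

The mean correlation of the true (same-cell) pairs can then be compared with this interval; constant curves would need to be excluded first because their correlation is undefined.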


Visual receptive fields of R neurons. In Fig. 3a, spikes were detected 
after low-pass filtering the recorded current at 1 kHz by identifying 
deflections greater than 15 pA that occurred outside a 0.5-ms refractory 
period. The spike rate was measured over the 500-ms visual stimulus 
period. 
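
One possible reading of this spike-detection rule is sketched below. It assumes that the current trace has already been low-pass filtered, that a deflection is measured relative to the median of the trace, and that a spike is counted at each threshold crossing that falls outside the refractory period; these choices and the function names are illustrative only.

```python
from statistics import median

def detect_spikes(current_pa, fs_hz, threshold_pa=15.0, refractory_s=0.0005):
    """Sample indices of threshold crossings separated by at least the refractory period."""
    baseline = median(current_pa)
    refractory = max(1, round(refractory_s * fs_hz))
    spikes, last = [], -refractory
    for i in range(1, len(current_pa)):
        above = abs(current_pa[i] - baseline) > threshold_pa
        was_above = abs(current_pa[i - 1] - baseline) > threshold_pa
        if above and not was_above and i - last >= refractory:
            spikes.append(i)
            last = i
    return spikes

def spike_rate_hz(spike_idx, fs_hz, start_s, dur_s=0.5):
    """Spike rate within a stimulus window of length `dur_s` starting at `start_s`."""
    a, b = start_s * fs_hz, (start_s + dur_s) * fs_hz
    return sum(a <= i < b for i in spike_idx) / dur_s
```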


Responses of E-PG neurons to optogenetic stimulation of R neurons.
In Fig. 3b, peak hyperpolarization was calculated as the trial-averaged
voltage during a 1-s baseline period minus the minimum trial-averaged
voltage reached in the 1 s following the 5-ms optogenetic stimulus.
Four optogenetic stimulus trials were recorded per cell.


E-PG ensemble representations of heading direction. In Fig. 4, rigid
motion correction in the x, y and z axes was performed for the volumetric
imaging stacks for every epoch using the NoRMCorre algorithm.
This algorithm performs piece-wise rigid registration of small overlapping
sectors within the field of view, and then merges the sectors via
interpolation, allowing approximate cancellation of non-rigid brain
movement artefacts. Motion correction was parallelized on a
high-performance computing cluster. For each epoch, we defined 16 regions
of interest, corresponding to the 16 glomeruli in the protocerebral
bridge; each region of interest was defined in one z plane. To calculate
the time-dependent change in fluorescence (ΔF/F) for each glomerulus,
we used a baseline fluorescence (F) defined as the mean of the lowest
5% of raw fluorescence values across the entire experiment for that
glomerulus. We excluded from the baseline the rare frames that were
lost as a result of the rigid motion correction algorithm. The singular
bump of activity in E-PG dendrites within the ellipsoid body translates
into two bumps in the protocerebral bridge; these two bumps move
together, so that the signal has a spatial period of eight glomeruli in the
protocerebral bridge. Therefore, to calculate the neural representation
of heading direction, we took the spatial Fourier transform of ΔF/F in
the protocerebral bridge across all 16 glomeruli. We used the phase of
the Fourier component at eight glomeruli as the phase of the neural
representation of heading for each time point; this procedure was
described previously. We used the sign convention in which a positive
change in phase corresponds to a rightward movement of the bumps
in the protocerebral bridge, and a clockwise movement of the bump in
the ellipsoid body (when viewed from the posterior side of the brain).
For display purposes only, in Fig. 4a, we averaged the ΔF/F signals from
the right and left half of the protocerebral bridge (which is why only
one bump is visible); this averaging was not performed as part of the
data analyses described above.
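
The baseline and phase calculations can be sketched as below. This is an illustration under stated assumptions: glomeruli are indexed from left to right, the phase is negated so that a shift towards higher indices gives a positive phase change, and the function names are hypothetical.

```python
import cmath
import math

def delta_f_over_f(raw, low_fraction=0.05):
    """ΔF/F for one glomerulus, with F taken as the mean of the lowest 5% of raw values."""
    k = max(1, round(len(raw) * low_fraction))
    f0 = sum(sorted(raw)[:k]) / k
    return [(f - f0) / f0 for f in raw]

def bump_phase(dff, period=8):
    """Phase (rad) and power of the spatial Fourier component with the given period.

    `dff` holds the ΔF/F values of the 16 glomeruli at a single time point.
    """
    omega = 2 * math.pi / period
    coeff = sum(v * cmath.exp(-1j * omega * g) for g, v in enumerate(dff))
    # Negate the phase so that a bump shifting to higher indices increases it.
    return -cmath.phase(coeff), abs(coeff) ** 2
```

For a bump profile `1 + cos(2π(g − g0)/8)` this returns a phase of `2π·g0/8`, so the two copies of the bump in the bridge map onto a single angle.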


Offset of the E-PG ensemble reference frame. In Fig. 4c, d, to 
calculate the offset of the reference frame (the difference between 
the fly’s heading and the neural representation of heading), we first 
downsampled the behavioural data to match the volumetric imaging 
rate (6-9 Hz). We removed time points in which the FicTrac analogue 
signals were problematic or when the power of the Fourier transform 
was below a specified threshold (0.1). We also excluded the first 3 s of 
each 2-min closed-loop epoch due to a delay between imaging trigger 
and the start of the visual stimulus. We then took the angular position 
of the visual panorama from the analogue voltage output of the LED 
panel system (positive defined as to the right of the fly, or clockwise 
when viewed from above the set-up). We calculated the offset of the 
E-PG ensemble reference frame as the negative of the spatial Fourier 
transform phase minus the position of the visual panorama. This value 
is consistent with previously published methods to calculate the offset
between the bump position in the protocerebral bridge and the ball yaw
position. To quantify offset probability (Fig. 4d, f and Extended Data
Fig. 7), we analysed the final 4 min of each pre-training block, the full
20 min of each training block, and the first 4 min of each post-training
block.
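
A small sketch of the offset calculation follows. It reads the definition as offset = (−phase) − panorama position, wrapped to the interval from −π to π, and drops time points whose Fourier power is below the threshold; this reading, the helper names and the use of radians throughout are assumptions.

```python
import math

def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def reference_frame_offset(fourier_phase, panorama_pos):
    """Offset between the neural heading estimate and the panorama position."""
    return wrap(-fourier_phase - panorama_pos)

def offsets(phases, powers, panorama, min_power=0.1):
    """Offsets for all time points whose Fourier power reaches the threshold."""
    return [reference_frame_offset(p, pos)
            for p, pw, pos in zip(phases, powers, panorama) if pw >= min_power]

def circular_mean(angles):
    """Circular mean of a list of angles in radians."""
    return math.atan2(sum(math.sin(a) for a in angles),
                      sum(math.cos(a) for a in angles))
```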


Effect of training on visual receptive fields. In Fig. 5 (and Extended
Data Figs. 8, 9), heading tuning curves and visual receptive fields were
smoothed using a median filter with a width of two samples (cue
positions). The last pre-training open-loop epoch (probe 1) and the first
post-training open-loop epoch (probe 2) were used for the following
analyses. In Fig. 5d-f, the absolute change was obtained by subtracting
the two visual receptive fields (post-training minus pre-training), then
summing the absolute value of the difference (Δ) over all cue positions,
and finally dividing by the number of positions. In Fig. 5d, the change
in receptive field shape was obtained by cross-correlating probe 1 and
probe 2 and calculating (1 − R²). In Fig. 5f, the modulation by heading
during turning was taken as the difference between the maximum and
the minimum of the heading direction tuning curve.
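
The three summary measures can be written compactly as in the sketch below. It takes the shape measure to be one minus the squared Pearson correlation at zero lag, which is an interpretation of the cross-correlation step; the function names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length curves."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_abs_change(pre, post):
    """Sum of |post - pre| over cue positions, divided by the number of positions."""
    return sum(abs(b - a) for a, b in zip(pre, post)) / len(pre)

def shape_change(pre, post):
    """One minus the squared correlation between the two receptive fields."""
    r = pearson(pre, post)
    return 1.0 - r * r

def heading_modulation(tuning_curve):
    """Difference between the maximum and minimum of a heading tuning curve."""
    return max(tuning_curve) - min(tuning_curve)
```

A uniform vertical shift of the receptive field changes `mean_abs_change` but leaves `shape_change` at zero, which is why the two measures are reported separately.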


Controls for training. In Fig. 5e, to estimate the drift in visual receptive
field under control conditions, we had flies navigate in a one-cue world
(rather than a two-cue world) during the waiting period between the
open-loop epochs. In some cases (matched control in Extended Data
Fig. 9), flies received exactly the same protocol as the experimental
condition except with one-cue closed-loop during the training period;
in other words, these matched controls received 12 consecutive minutes
of one-cue (rather than two-cue) closed-loop epochs during the
‘training’ period. For other controls (control in Extended Data Fig. 9),
we identified experiments from Fig. 2 in which the recording had lasted
long enough for us to present four open-loop epochs interleaved with
four one-cue closed-loop epochs. In these recordings, the second and
fourth epochs were separated by more than 12 min (typically around
15 min) and so they are appropriate controls for the training protocol.
We therefore treated the second and fourth open-loop epochs as if
they were ‘probe 1’ and ‘probe 2’ epochs in a training experiment, and
we analysed them as described above for the true training experiments.
The important distinction is that this second group of control flies
experienced one-cue rather than two-cue closed-loop epochs during
the window between probe 1 and probe 2.


Data inclusion 

We included epochs for Figs. 1, 2 if the cell was healthy; specifically, this
meant that the epoch-averaged voltage was below −33 mV and within
15 mV of the voltage observed at the start of the first epoch of the
experiment, and that the spike amplitude was more than 50% of the amplitude
observed in the first epoch. Closed-loop epochs were included if the fly
visited all heading directions during that epoch. Cells were included if
≥2 open-loop epochs met these criteria; in Fig. 2 we also required that
≥2 closed-loop epochs met these criteria. In Fig. 3, cells were included
if ≥2 open-loop epochs met our cell health criteria. A single recording
from the UAS/+ control genotype was excluded because the biocytin
fill showed that it was not an E-PG neuron. All other biocytin-filled
neurons analysed during this project (that is, 65 out of 66 neurons) were
confirmed to be E-PG neurons. All recordings that were not imaged
post hoc were therefore assumed to target E-PG cells. We excluded 5
out of 24 flies in Fig. 4 owing to either weak fluorescence or an unstable
offset between the angle of the E-PG bump and the fly’s heading angle
at the end of the initial closed-loop one-cue epoch. In Fig. 5, cells were
included if the epoch-averaged voltage from all epochs of the
experiment (pre-training, training, post-training) was below −33 mV and if
the fly visited all heading directions during the two epochs (8 min) of
one-cue closed-loop before training and during the final two epochs
(8 min) of two-cue closed-loop training. We required that the fly’s
mean yaw velocity was >20°/s during the final 2 epochs of the two-cue
closed-loop training; 10 cells were excluded due to this restriction.
We also removed recordings in which the visual receptive field and/or
heading tuning curve were almost flat during the pre-training period
(max − min < 2 mV); six cells were removed due to this restriction.
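
The epoch-level health criteria lend themselves to a simple predicate, sketched here with hypothetical argument names. The thresholds are the ones stated in the text; treating "within 15 mV" as an inclusive bound is an assumption.

```python
def epoch_is_healthy(mean_v, first_epoch_start_v, spike_amp, first_epoch_spike_amp,
                     max_v=-33.0, max_drift=15.0, min_amp_fraction=0.5):
    """True if an epoch meets the voltage, drift and spike-amplitude criteria (mV)."""
    return (mean_v < max_v
            and abs(mean_v - first_epoch_start_v) <= max_drift
            and spike_amp > min_amp_fraction * first_epoch_spike_amp)

def curve_is_flat(curve, min_range=2.0):
    """True if a tuning curve or receptive field spans less than `min_range` mV."""
    return max(curve) - min(curve) < min_range
```

A cell-level filter would then count the epochs that pass `epoch_is_healthy` and compare that count with the minimum number required for the figure in question.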

On occasion, during E-PG neuron electrophysiological recordings,
we observed unexpected large inhibitory postsynaptic potentials with
a stereotyped sharp onset, a large amplitude (>15 mV) and a
stereotyped time course. They were followed by a prolonged period of
depolarization when the variance of the voltage trace was also diminished.
These events interfered with visual and heading tuning measurements;
therefore, for Figs. 1-3, any epoch in which such an event occurred was
excluded from the analysis. For Fig. 5, the event was clipped but the rest
of the epoch was used; 5% of open-loop epochs and 10% of closed-loop
epochs were clipped in this manner.

Analysis was performed using MATLAB R2016b, R2017a and R2017b 
(MathWorks). 


Determination of sample sizes 

For genetic perturbation experiments (Fig. 3c), the number of
experiments performed was determined by first collecting a pilot dataset
(n = 4 for each of the three genotypes using the R20A02-Gal4 driver line).
On the basis of the initial effect size, power analysis was used to determine
the number of experiments needed to test the hypothesis that visually
evoked hyperpolarization was smaller in the experimental genotype.
For all other experiments, sample sizes were chosen based on standard
sample sizes in the field.


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The datasets generated during and/or analysed during the current study 
are available from the corresponding author on reasonable request. 


Code availability 


Analysis code is available at
https://github.com/wilson-lab/FisherLuDAlessandroWilson_AnalysisCode.
Extended Data Fig. 1 | Measuring behaviour and E-PG visual responses.

a, Side view of a fly walking on an air-cushioned ball during an
electrophysiology experiment. b, Image of the ball and plastic holder. Air flows
up through the holder and out the semi-spherical depression that cradles the
ball. c, Schematic of the experimental set-up viewed from above. The fly is
secured in an aperture in the centre of a horizontal platform. The platform is
surrounded by a circular panorama. The panorama is composed of square LED
arrays (2 squares vertically × 12 squares horizontally). The ball is illuminated
by an infrared (IR) LED, which is visible as a red spot in b. A camera captures an
image of the ball to enable tracking using FicTrac. Inset shows FicTrac view.
Camera and infrared LED are not drawn to scale. d, The yaw velocity of the fly
compared to the cue position. This is the dataset that is the basis for Fig. 1f, but
here broken down into averages for each individual fly, and with right (+) and
left (−) cue positions kept separate. Positive velocities are right turns, and
negative velocities are left turns. No tests showed a statistically significant yaw
velocity (P < 0.05, two-sided comparison to bootstrap distribution) for any
individual fly at any cue position. For details of analysis, see Methods, ‘Yaw
during open-loop epochs’. e, Yaw velocity in response to the visual cue
presentation. This analysis is the same as that shown in Fig. 1f, but here yaw
velocity is plotted against the distance of the cue jump between consecutive
trials. As in Fig. 1f, we show mean (black) ± 1 s.d. (grey) across experiments (73
experiments in 68 flies). Magenta lines show the bootstrapped 95% confidence
interval of the mean across flies after randomizing cue positions,
Bonferroni-corrected for multiple comparisons. Because the mean lies within these
bounds, it is not significantly different from random. This analysis further
supports the conclusion that there is no systematic yaw response to the
random flashes of the vertical bar. For details of analysis, see Methods, ‘Yaw
during open-loop epochs’. f, The visual receptive field of an example cell
measured multiple times over the course of a 40-min recording. Each row
shows data from a separate visual mapping epoch. Data from this example cell
are also shown in Fig. 1e. Note the stability of the visual receptive field over this
time period. For experiments shown in this figure, we used
UAS-mCD8::GFP/UAS-mCD8::GFP; R60D05-Gal4/R60D05-Gal4 flies.
Extended Data Fig. 2 | Visually evoked hyperpolarization and
depolarization, during and after cue presentation. a, Example voltage
responses of the same E-PG neuron to two cue positions. Dashed lines indicate
the mean baseline voltage before the cue. This neuron is hyperpolarized by the
cue at 90° and depolarized by the cue at −97°. Note that hyperpolarization
decays more rapidly than depolarization. In b, to quantify visual receptive
fields, we measured the change in voltage during cue presentation and after
cue removal in the 250-ms windows marked in a with brackets, in both cases
relative to baseline. b, Summary of E-PG visual receptive fields measured
during cue presentation. Cells are sorted by the cue position that evokes
maximal hyperpolarization. The histogram shows the number of E-PG neurons
with maximal hyperpolarization at each cue position (73 E-PG neurons in 68
flies). c, Summary of E-PG visual receptive fields measured after cue removal.
Cell order is the same as in b. Note that hyperpolarizing responses tend to
decay, whereas depolarizing responses tend to persist; this is consistent with
the hypothesis that the hyperpolarization during cue presentation is due to
direct synaptic inhibition from R neurons, whereas depolarization is
polysynaptic and caused by withdrawal of tonic synaptic inhibition. The
histogram shows the number of E-PG neurons with maximal hyperpolarization
after cue removal for each cue position. d, Same as b, but sorted by the cue
position that evoked maximal depolarization (minimal hyperpolarization), as
in Fig. 1g. e, Same as c, but with the cell order as in d. f, Summed response across
all neurons measured during (left) and after (right) the cue. The left curve has a
pair of minima around ±100°; this bias is probably inherited from R neuron
receptive fields, which are biased towards positions offset from the visual
midline. By contrast, the right curve is relatively flat. g, Visual cue position
eliciting maximal depolarization (minimum hyperpolarization), plotted versus
E-PG neuron location, for the 21 recorded E-PG neurons that were filled. No
significant correlation was observed (circular correlation coefficient = −0.15,
P = 0.49). For experiments shown in this figure, we used
UAS-mCD8::GFP/UAS-mCD8::GFP; R60D05-Gal4/R60D05-Gal4 flies.
[Axis label: cue position for peak depolarization or minimum hyperpolarization (°)]
Extended Data Fig. 3 | E-PG neuron pairs recorded sequentially from the
same brain. a, Two biocytin-filled dendrites (green) from sequentially
recorded E-PG neurons that innervate adjacent wedges within the ellipsoid
body. Neuropil reference marker is shown in grey (anti-nc82 antibody). Images
are maximum intensity z-projections. Scale bar, 10 μm. The schematic shows
the approximate position of ellipsoid body and E-PG dendrites from a coronal
view of the fly brain. b, c, Heading tuning (red, measured in VR) and visual
receptive field (blue, measured with random flashes) from sequentially
recorded E-PG pairs from two example flies. Dendritic locations of the
recorded neurons are green in the ellipsoid body schematic above each set of
plots. In both cases, by chance, the two dendrites were physically adjacent. In
both cases, adjacent E-PG neurons from the same fly exhibited similar visual
receptive fields and heading tuning curves, supporting the conclusion that
adjacent E-PG cells typically receive inhibition from adjacent regions of visual
space and represent adjacent heading directions. Comparing the visual
receptive field and the heading tuning curve for each neuron yielded
correlation coefficients (Pearson’s) of 0.76 (fly 1 neuron 1), 0.90 (fly 1 neuron 2),
0.95 (fly 2 neuron 1) and 0.65 (fly 2 neuron 2). For experiments shown in this
figure, we used UAS-mCD8::GFP/UAS-mCD8::GFP; R60D05-Gal4/R60D05-Gal4
flies.
Extended Data Fig. 4 | Visual receptive fields and heading tuning of E-PG
neurons. Heading tuning (red, closed-loop mode) and visual receptive fields
(blue, open-loop mode) for all 40 recorded E-PG neurons (from 39 flies). For
each neuron, the correlation coefficient (Pearson’s) is reported for the
comparison between the visual receptive field and the heading tuning curve.
Asterisks denote data also shown in Fig. 2. For experiments shown in this
figure, we used UAS-mCD8::GFP/UAS-mCD8::GFP; R60D05-Gal4/R60D05-Gal4
flies.
Extended Data Fig. 5 | R neuron types labelled by R20A02-Gal4 and
R54E12-Gal4 described by MCFO. a, Observed numbers of R neurons belonging to
each type from a dataset of n = 78 single-neuron MCFO clones from the
R20A02-Gal4 line. R neuron types were classified according to previously
published methods. b, Same as in a but for the R54E12-Gal4 line (n = 61
single-neuron MCFO clones). c-h, Examples of single R neuron MCFO clones. Images
are maximum intensity z-projections. Background labelling was manually
removed to improve clarity of specific neuronal morphologies. i, Multiple
R neuron MCFO clones labelled in different colours using the R20A02-Gal4 line.
Image is a maximum-intensity z-projection. Scale bars, 20 μm. For experiments
shown in this figure, we used R57C10-FLPG5.PEST; UAS(FRT.stop)myr::smGdP-HA,
UAS(FRT.stop)myr::smGdP-V5, UAS(FRT.stop)myr::smGdP-Flag/R20A02-Gal4,
R57C10-FLPG5.PEST; UAS(FRT.stop)myr::smGdP-HA, UAS(FRT.stop)myr::smGdP-V5,
UAS(FRT.stop)myr::smGdP-Flag/R54E12-Gal4 flies.
Extended Data Fig. 6 | Suppressing R neuron activity with two independent
driver lines reduces visually evoked hyperpolarization in E-PG neurons.

a, Same as Fig. 3c, except instead of measuring peak visually evoked
hyperpolarization, we measured mean visually evoked hyperpolarization (by
zeroing all non-negative visual responses and then averaging visual responses
across all cue positions). From left to right: n = 8, 10, 12, 10, 9. Both Kir2.1 means
are significantly different from corresponding genetic controls using
two-sided Wilcoxon rank-sum tests. R20A02 Kir2.1 versus R20A02/+ and UAS/+
(P = 0.0013 and P = 0.0003, respectively), R54E12 Kir2.1 versus R54E12/+ and
UAS/+ (P = 0.005 and P = 0.0025, respectively). b, R neuron population labelled
by Kir2.1::eGFP. Images are maximum intensity z-projections. c, Numbers of R
neurons per hemisphere expressing Kir2.1::eGFP in each experimental
genotype, n = 9 (R20A02) and n = 11 (R54E12) (horizontal lines are means). On
the basis of the previously reported total number of R neurons of each type
and our MCFO quantification of the R neuron types labelled by R20A02-Gal4
and R54E12-Gal4 (Extended Data Fig. 5), these cell counts suggest that
R20A02-Gal4 targets approximately 20% of R2, 30% of R4m and all R4d neurons. These
counts suggest that R54E12-Gal4 targets approximately 40% of R2 neurons and
all R4m and R4d neurons. This incomplete targeting of outer R neurons may
provide one explanation for the remaining visually evoked inhibition observed
in some recordings (Fig. 3). Note that although both driver lines label other
neurons in the central brain and visual system, R neurons appear to be the only
cell type that is labelled by both lines. In the visual system, the driver line
R20A02-Gal4 targets one medulla intrinsic neuron, probably Mi12, and one cell
type that arborizes in around layers 4-6 of the lobula, whereas the driver line
R54E12-Gal4 appears to target the medulla neuron Tm3. For experiments
shown in this figure, we used +/w; R60D05-LexA/LexAop-mCD8::GFP; +/UAS-Kir2.1
(UAS-only control); +/w; R60D05-LexA/LexAop-mCD8::GFP; R20A02-Gal4/+
(R20A02 Gal4-only control); +/w; R60D05-LexA/LexAop-mCD8::GFP;
R54E12-Gal4/+ (R54E12 Gal4-only control); +/w; R60D05-LexA/LexAop-mCD8::GFP;
R20A02-Gal4/UAS-Kir2.1 (R20A02 Kir2.1); and +/w;
R60D05-LexA/LexAop-mCD8::GFP; R54E12-Gal4/UAS-Kir2.1 (R54E12 Kir2.1) flies.
Extended Data Fig. 7 | Offset probability histograms in training
experiments. Offset probability histograms during each segment of the
training experiments shown in Fig. 4, for all 19 GCaMP imaging experiments (in
19 flies). As in Fig. 4, the circular mean during the pre-training period is defined
as offset₀ (here marked with an arrowhead), and for display purposes we
horizontally aligned all of the offset₀ values in different flies. Asterisks mark
data shown in Fig. 4. For experiments shown in this figure, we used +/w;
UAS-GCaMP6f/+; R60D05-Gal4/+ flies.
[Figure panels: pre-training (1 cue); training (2 cues); post-training (1 cue)]
Extended Data Fig. 8 | Heading tuning and visual receptive field
measurements in training experiments. Heading tuning curves and visual
receptive fields for all additional 17 E-PG neurons (from 17 flies) from the
training experiments in Fig. 5. As in Fig. 5, red solid curves are heading tuning.
The red dashed curves are the change in heading tuning (training minus
pre-training). Blue curves are visual receptive fields. The blue dashed curve is the
change in the visual receptive field (second probe minus first probe). Seven
neurons from this dataset are also shown in Figs. 1, 2.
Extended Data Fig. 9 | Controls for remapping experiments. a, Data
reproduced from Fig. 5e. Absolute change in visual receptive fields. Control
flies navigated in a one-cue world (rather than a two-cue world) during the
waiting period between the open-loop epochs used to compute the change in
visual responses. In some cases (matched control), flies received exactly the
same protocol as the experimental condition except with one-cue closed-loop
epochs during the training period; in other words, these matched controls
received 12 consecutive minutes of one-cue (rather than two-cue) closed-loop
epochs during the training period. In all other cases (control), flies received
4-min blocks of one-cue closed-loop epochs interleaved with 150-s open-loop …

… receptive fields from control cells. Blue dashed curve is the change in visual
receptive field (second probe minus first probe) over the control period.
Typically, visual receptive fields were stable over time under control conditions
(control neurons 2 and 3). On occasion, we observed spontaneous changes in
the visual receptive field of an E-PG neuron during the control period (for
example, control neuron 1), although these changes were not as large as the
changes that we observed in many neurons in trained flies (see a). c, Heading
tuning in the same three control cells. Note how the spontaneous changes in
visual receptive fields seen in neuron 1 are accompanied by changes in heading
tuning.
Reporting Summary


Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency 
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Statistics 


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 




For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection: Matlab 2016a, Matlab 2017a, Matlab 2017b, ScanImage 2017, Fiji (https://fiji.sc/), FicTrac (http://rjdmoore.net/fictrac/)


Data analysis: All analyses of calcium imaging and electrophysiology data were performed using custom code written in Matlab 2017b
(electrophysiology) and Matlab 2016b (calcium imaging) (see Methods for full description of analysis; code is available at
https://github.com/wilson-lab/FisherLuDAlessandroWilson_AnalysisCode). Confocal images were analyzed using Fiji (ImageJ) and cell body
counting was performed using the Fiji Cell Counter plugin (Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis.
Nat Methods 9, 676-682 (2012)).


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 


All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 
- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability


The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. 
Field-specific reporting


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size For genetic perturbation experiments (Fig. 3c), the number of experiments performed was determined by first collecting a pilot data set (n=4 
for each of the 3 genotypes using the R20A02-Gal4 driver line). Based on the initial effect size, power analysis was used to determine the 
number of experiments needed to test the hypothesis that visually-evoked inhibition was smaller in the experimental genotype. For all other 
experiments, sample sizes were chosen based on standard sample sizes in the field. 
Data exclusions Figs. 1 & 2: Epochs were included if the cell was healthy; specifically, this meant that the epoch-averaged voltage was below -33 mV and 
within 15 mV of the voltage observed at the start of the first epoch of the experiment, and also if the spike amplitude was >50% of the 
amplitude observed in the first epoch. Closed-loop epochs were included if the fly visited all heading directions during that epoch. Cells were 
included if ≥2 open-loop epochs met these criteria; in Fig. 2 we also required that >2 closed-loop epochs met these criteria. 

Fig. 3: Cells were included if >2 open-loop epochs met our cell health criteria. A single recording from the UAS/+ control genotype was 
excluded because the biocytin fill showed that it was not an E-PG neuron. 

Fig. 4: 5/24 flies were excluded due to either weak fluorescence or an unstable offset between the angle of the E-PG bump and the fly’s 
heading angle at the end of the initial closed-loop 1-cue epoch. 

Fig. 5: Cells were included if the epoch-averaged voltage from all epochs of the experiment (pre-training, training, post-training) was <-33 mV, 
and if the fly visited all heading directions during the 2 epochs (8 min) of 1-cue closed-loop prior to training and during the final 2 epochs (8 
min) of 2-cue closed-loop training. We required that the fly’s mean yaw velocity was >20°/s during the final 2 epochs of the 2-cue closed-loop 
training; 10 cells were excluded due to this restriction. We also removed recordings where the visual and/or heading turning curves were 
almost flat during the pre-training period (max-min <2mV); 6 cells were removed due to this restriction. 


On occasion, during E-PG neuron electrophysiological recordings, we observed unexpected large inhibitory postsynaptic potentials with a 
stereotyped sharp onset, a large amplitude (>15mV), and a stereotyped time course. They were followed by a prolonged period of 
depolarization when the variance of the voltage trace was also diminished. These events interfered with visual and heading tuning 
measurements, and so for Figs. 1-3, any epoch where such an event occurred was excluded from the analysis. For Fig. 5, the event was 
clipped but the rest of the epoch was used; 5% of open loop epochs and 10% of closed-loop epochs were clipped in this manner. 


Replication For genetic perturbation experiments (Fig 3c) we reproduced the effect of R neuron silencing with two independent driver lines (R20A02 and 
R54E12). For all other experiments, results were replicated in different individuals within each data set. 


Randomization For genetic perturbation experiments (Fig 3c) flies were grouped based on genotype (neuronal activity manipulated by Kir2.1 expression vs 
control genotypes). For comparison between trained and control flies (Fig 5) the experimental protocol that would be performed (control vs 
training) was always decided on prior to starting the experiment. No other randomization was performed. 


Blinding The experimenter was not blind to genotype except when genetic perturbations were used: Figure 3c (Kir2.1 perturbation). For the Figure 3c 
data set collected for driver line R20A02-Gal4 the experimenter was blind to genotype after the pilot phase; because Fig. 3c pilot data were 
indistinguishable from subsequent data, all data were ultimately pooled, and overall the experimenter was blind to genotype in 67% of these 
recordings. For the data set obtained using the driver line R54E12-Gal4, the experimenter was not blind to genotype because the 
experimental genotype was obtained at a lower-than-expected (sub-Mendelian) frequency, making it impractical to blind the experimenter. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems (n/a | Involved in the study) 
- Antibodies 
- Eukaryotic cell lines 
- Palaeontology 
- Animals and other organisms 
- Human research participants 
- Clinical data 

Methods (n/a | Involved in the study) 
- ChIP-seq 
- Flow cytometry 
- MRI-based neuroimaging 


Antibodies 


Antibodies used -rat anti-FLAG (Novus Biologicals), Cat#: NBP1-06712B, RRID: AB_10006034, clone#: L5, lot#: B-3 
-rabbit anti-HA (Cell Signaling Technologies), Cat#: 3724 RRID: AB_1549585, clone#: C29F4, lot#: 9 
-DyLight 550-conjugated mouse anti-V5 (Bio-Rad), Cat#: MCA1360D550GA, RRID: AB_2687576, clone#: SV5-Pk1 
-mouse anti-Bruchpilot antibody (Developmental Studies Hybridoma Bank, nc82), Cat#: nc82, RRID: AB_2314866 
-chicken anti-GFP (Abcam), Cat#: ab13970, RRID: AB_300798 


Validation MultiColor FlpOut (MCFO) immunohistochemistry: 
The MultiColor FlpOut (MCFO) genetic strategy (Nern et al. 2015) uses expression of epitopes (HA, FLAG and V5) that are not 
endogenous to the fly genome. For MCFO immunostaining in our study we followed the exact protocol as established and 
validated in Drosophila by Nern et al. that uses anti-HA, anti-FLAG and anti-V5 antibodies (Nern et al. 2015). These antibodies 
have also each been validated prior to Nern et al: 


rat anti-FLAG: Manufacturer notes confirm that rat anti-FLAG (Cat#: NBP1-06712B) has also been validated as FLAG-Tag specific 
in Drosophila (PMID: 26573957). 
rabbit anti-HA: Manufacturer confirmed that the rabbit anti-HA antibody has epitope-tag specificity using western blot and 
immunohistochemical analysis comparing untransfected with HA-tag transfected COS cells (https://www.cellsignal.com/products/primary-antibodies/ha-tag-c29f4-rabbit-mab/3724#Vvalidation-data). 


DyLight 550-conjugated mouse anti-V5: Manufacturer notes confirm that the DyLight 550-conjugated-Mouse anti V5-Tag, clone 
SV5-Pk1 recognizes the sequence, IPNPLLGLD, present on the P/V proteins of the paramyxovirus, SV5 (Dunn et al.1999) and can 
be used to detect recombinant proteins labeled with this V5-tag (Randall et al.1993 and Zhao et al. 2005). 


Other immunohistochemistry looking at neuron anatomy: 


The anti-Bruchpilot antibody (DSHB) is the standard in the Drosophila field as a background stain that labels presynaptic active 
zones to provide neuropil labeling for analysis of anatomy. This antibody was originally validated for use in Drosophila to label 
presynaptic active zones using immunohistochemistry and to be specific to Bruchpilot protein (Wagh et al. 2006). 


The anti-GFP antibody (Abcam) is the standard antibody used in the field for labeling exogenous expression of Green Fluorescent 
Protein (GFP) in Drosophila; note that this protein is not endogenously expressed in the Drosophila genome. Manufacturer's 
datasheet confirms that this anti-GFP antibody has been validated using western blot and immunohistochemistry to have 
specificity for Green Fluorescent Protein. Manufacturer also confirms the use of this antibody for immunolabeling of GFP in 
Drosophila across 121 peer-reviewed manuscripts (e.g. Sykes et al. 2005 PMID: 16122730). 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals Below is a complete description from the Methods section of all transgenic Drosophila strains used in this study, including how 
they were obtained and their citation: 


The following Gal4 lines were obtained from the Bloomington Drosophila Stock Center (BDSC) and are described in ref. 40: 
P{R60D05-Gal4}attP2, P{R60D05-lexA}attP40, P{R19C08-lexA}attP40, P{R12B01-Gal4}attP2, P{R54E12-Gal4}attP2, P{R20A02-Gal4}attP2. 
P{GawB}EB1 was obtained from the BDSC and is described in Wang et al. 2002. 

P{20XUAS-IVS-mCD8::GFP}attP40 was a gift from Barret Pfeiffer and Gerry Rubin and is described in Pfeiffer et al. 2010. 
P{13XLexAop2-mCD8::GFP}attP40 was obtained from the BDSC and is described in Pfeiffer et al. 2010. 
PBac{13xLexAop2-IVS-Syn21-Chrimson::tdT-3.1}VK00005 was a gift from Barret Pfeiffer and David Anderson and is described in Hoopfer et al. 2013. 
P{20X-UAS-CsChrimson-tdTomato}VK00005 was a gift from John Tuthill who obtained it from Barret Pfeiffer. 
P{UAS-Hsap\KCNJ2.EGFP}7 was obtained from the BDSC and is described in Hardie et al. 2001. 
P{UAS-GCamp6f}attP40 was obtained from the BDSC via Thomas Clandinin and is described in Chen et al. 2013. 

Transgenes for MultiColor FlpOut were obtained from the BDSC and are described in Nern et al. 2015; these are w[1118] 
P{y[+t7.7] w[+mC]=GMR57C10-FLPGS.PEST}su(Hw)attP8; PBac{y[+mDint2], and w[+mC]=10xUAS(FRT.stop)myr::smGdP- 
HA}VK00005 P{y[+t7.7] , and w[+mC]=10xUAS(FRT.stop)myr::smGdP-V5-THS-10xUAS(FRT.stop)myr::smGdP-FLAG}su(Hw)attP1. 
Wild animals No wild animals were used in this study. 
Field-collected samples No field samples were collected for this study. 
Ethics oversight No ethical approval was required because all experiments in this study were performed on Drosophila melanogaster. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 
Many animals rely on an internal heading representation when navigating in varied 
environments. How this representation is linked to the sensory cues that define 
different surroundings is unclear. In the fly brain, heading is represented by ‘compass’ 
neurons that innervate a ring-shaped structure known as the ellipsoid body. Each 
compass neuron receives inputs from ‘ring’ neurons that are selective for particular 
visual features; this combination provides an ideal substrate for the extraction of 
directional information from a visual scene. Here we combine two-photon calcium 
imaging and optogenetics in tethered flying flies with circuit modelling, and show how 
the correlated activity of compass and visual neurons drives plasticity, which flexibly 
transforms two-dimensional visual cues into a stable heading representation. We also 
describe how this plasticity enables the fly to convert a partial heading representation, 
established from orienting within part of a novel setting, into a complete heading 
representation. Our results provide mechanistic insight into the memory-related 
computations that are essential for flexible navigation in varied surroundings. 


Internal representations of the spatial relationship of an animal to its 
surroundings are essential for flexible navigation. Although these 
representations must be stable to be useful for planning and goal- 
oriented behaviour, they must also adapt to changes in environmental 
and behavioural contexts. Indeed, the representations provided by head- 
direction cells, grid cells and place cells are all known to remap in different 
surroundings on the basis of spatially relevant sensory information. 
A central question in navigation concerns how the brain carries out this 
flexible transformation of sensory information into a stable internal 
representation. In insects, a multifunctional brain region known as 
the central complex (Fig. 1a) has a key role in visually guided navigation, 
including flexible heading selection and place learning. Many of 
these abilities rely on successfully incorporating visual information from 
landscapes or the pattern of polarized light and chromatic gradients 
in the sky to generate an internal representation of heading in the 
central complex; specifically, a bump of activity in compass neurons 
(also known as E-PG neurons; see ‘Nomenclature’ in Methods) in the 
ellipsoid body, a substructure of the central complex (Fig. 1a, b). These 
neurons are an important part of a ring attractor network that maintains 
and updates the heading representation on the basis of self-motion 
and visual signals. Visual inputs are brought to the ellipsoid body by 
GABAergic (γ-aminobutyric-acid-releasing) ring neurons, which have 
localized spatiotemporal receptive fields (Fig. 1c). Here we show how 
network plasticity enables the flexible generation of a stable compass- 
neuron heading representation in different visual scenes. 


Remapping of heading representation 


To explore the flexibility of heading representation in the fly, we 
used two-photon calcium imaging to monitor responses of the 
compass-neuron population in head-fixed flies that were flying in a 
virtual-reality arena, which consisted of panels of light-emitting diodes. 
The virtual-reality setup gave the insect one-dimensional, closed-loop 
control of its orientation relative to visual scenes (Fig. 1d-g, Methods). 
Visual environments were derived from two natural scenes (Fig. 1h, i). 
The compass-neuron response in these scenes rapidly stabilized into 
an activity bump in the ellipsoid body that maintained a consistent 
angular relationship to the visual scene as the fly turned (Fig. 1h, i). 
Previous studies in simpler visual settings (such as a single stripe) 
have shown that the bump tracks the visual scene, but with an offset 
between the angular position of the bump in the ellipsoid body and 
the angular orientation of the stripe relative to the fly. This pinning 
offset between the bump and visual cues (Methods) seldom changes 
across trials for a given fly in a specific visual setting, but differs across 
flies. We found that the pinning offset also varied substantially 
across different naturally derived scenes for a single fly, and across 
flies for the same scene (Fig. 1j). We argue that this variable but stable 
offset is the natural outcome of plasticity in synapses that flexibly maps 
visual scenes onto the heading representation. 
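
As an illustration of the quantities discussed above, the following minimal sketch computes a population vector average (PVA) over the 16 ellipsoid-body ROIs and the pinning offset between the bump and the scene. The function names, the ROI-to-angle convention and the amplitude normalization are assumptions made for this example; the definitions in the Methods are authoritative.

```python
import cmath
import math


def population_vector_average(dff):
    """Return (angle, amplitude) of the activity bump from per-ROI dF/F values.

    Assumed convention: ROI k sits at angle 2*pi*k/N around the ring, and the
    amplitude is the length of the mean vector across ROIs.
    """
    n = len(dff)
    vec = sum(a * cmath.exp(1j * 2 * math.pi * k / n) for k, a in enumerate(dff)) / n
    return cmath.phase(vec) % (2 * math.pi), abs(vec)


def pinning_offset(bump_angle, scene_angle):
    """Circular difference (bump minus scene), wrapped to (-pi, pi]."""
    d = (bump_angle - scene_angle) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d


# Hypothetical frame: activity concentrated in ROI 4 of 16, scene at 0 rad.
dff = [0.0] * 16
dff[4] = 1.0
angle, amplitude = population_vector_average(dff)
offset = pinning_offset(angle, 0.0)
```

A stable heading representation corresponds to this offset staying roughly constant over time as the scene rotates.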

If activity-dependent plasticity between visual inputs and compass 
neurons underlies the observed variability in offset (Fig. 1j), experi- 
encing an imposed artificial relationship between the scene and the 
bump should induce a sustained change in offset (as proposed for 
mammalian navigation systems). A previous study of tethered flying 
flies used two-photon-localized optogenetics to temporarily displace 
a compass-neuron bump in the ellipsoid body by an arbitrary angle. 
As in this previous study, here the original bump (Fig. 2a, d top) was 
quickly replaced by a displaced bump generated by focal optogenet- 
ics (Fig. 2b, Extended Data Fig. 1a). We then paired this artificial bump 
with an open-space scene (Fig. 1i) that was placed at a predetermined 
Fig. 1 | E-PG neurons stably represent heading in different visual 
environments. a, Central complex. Visual inputs to the ellipsoid body arrive 
from the optic lobe through the anterior optic tubercle to ring-neuron 
dendrites in the bulb. EB, ellipsoid body; PB, protocerebral bridge; BU, bulb; 
FB, fan-shaped body; NO, noduli; AOTU, anterior optic tubercle. b, Ring 
neurons (purple and green) project from the bulb to the entire circumference 
of the ellipsoid body. E-PG or compass neurons (solid grey arrows) innervate 
single ellipsoid-body wedges. Circuit details have previously been published. 
Dashed arrows, P-EN neurons (angular velocity). Small blobs in the ellipsoid 
body, synapses between ring and compass neurons. c, Fictive sample receptive 
fields (red, excitatory; blue, inhibitory) of two ring neurons (purple and green 
in b) shown in a flattened representation of the visual field (grey rectangle). 
The vertical stripe presented in the visual arena activates the green ring 
neuron. RF, receptive field. d, Imaging setup. IR, infrared; LED, light-emitting 
diode. e, Tethered flying fly. f, Ellipsoid body segmented into 16 regions of 


angular position in the arena relative to the bump (Fig. 2b, d middle, 
Supplementary Video 1). We repeatedly shifted the artificial bump 
through eight positions around the ellipsoid body, while simultane- 
ously shifting the scene around the visual arena to maintain its fixed 
angular relationship to the imposed bump (Fig. 2b, d middle). A 5-min 
pairing protocol was sufficient to change the offset, and the newly 
imposed relationship between the visual scene and the compass-neuron 
bump was clearly preserved in subsequent closed-loop probe trials 
(Fig. 2c, d bottom, Extended Data Fig. 1f, h, l). Such remapping could 
also be induced with simpler visual scenes (such as a single stripe; 
Extended Data Fig. 1b, e, g, i, m), but could not be induced without 
the optogenetic reagent or in darkness (Extended Data Fig. 1j, k, n, o). 
Thus, we find strong experimental support for plasticity that enables 
visual surroundings to be flexibly remapped onto the compass-neuron 
population upon sustained experience of a specific angular relation- 
ship between the bump and the scene. 


Plasticity creates a stable compass 


The experience-dependent remapping that we observed (Fig. 2a—d), 
which involves co-activation of specific visual inputs and compass 
neurons, is strongly suggestive of Hebbian plasticity, which has been 
hypothesized to explain how mammalian head-direction cells tether 


interest (ROIs). g, Population vector average (PVA) of ΔF/F0, computed to obtain 
the angular position and amplitude of the compass-neuron activity bump. 
h, Compass-neuron calcium transients during closed-loop tethered flight in a 
visual environment derived from a natural scene (forest; shown at the top). 
Middle, actual scene presented on an arena of blue light-emitting diodes with 
discretized brightness. Snapshots of compass-neuron activity in the ellipsoid 
body at times t1 and t2, corresponding to different scene orientations. Bottom, 
ΔF/F0 of 16 ROIs over time. Greyscale band, PVA amplitude. Red line, scene 
orientation. GCaMP signal colour-coded in blue. Black line, PVA. i, Calcium 
transients from the fly in h in a different scene (top), open space. j, Distribution 
of mean pinning offset across flies. Offset distribution for the open-space 
scene is significantly different from uniform for unknown reasons (open-space 
scene, 39 trials, 10 flies, unimodality test by randomization, P < 0.0001; forest 
scene, 40 trials, 10 flies, P = 0.3603). 


to visual cues. We built an anatomically motivated circuit model to 
better understand the effect of such a plasticity mechanism on scene- 
to-bump remapping. The key components of the model (Fig. 2e-h; 
implementation details are given in the Supplementary Information) 
are: (i) visual ring neurons that distribute information about visual 
features to all compass neurons throughout the ellipsoid body 
(Fig. 1b, c, 2f), where for simplicity we treat ring-neuron receptive fields as 
encoding only azimuthal information and address the two-dimen- 
sional spatiotemporal complexity of their responses in a later section; 
(ii) ring attractor dynamics, a form of all-to-all competitive network 
dynamics that ensures a single compass-neuron bump that can remain 
active in darkness; (iii) a plasticity rule through which the co-activa- 
tion of GABAergic inhibitory ring neurons and compass neurons results 
in a depression of the synaptic weight between them (inhibitory Heb- 
bian plasticity), whereas the activation of compass neurons alone 
results in potentiation (alternative plasticity rules are given in Supple- 
mentary Information). In this model (which shares some conceptual 
similarities with recent models of mammalian head-direction cells and 
grid cells), the turns that the fly undertakes cause a retinotopic shift of 
the visual stimulus (which activates a different set of ring neurons), and 
angular velocity signals that are carried by so-called P-EN neurons 
(dotted lines in Fig. 1b) rotate the compass-neuron bump. For a stable 
heading representation, bump positions driven by visual input and 
Fig. 2 | Manipulation of heading representation pinning offset. a–d, Activity 
snapshots of compass neurons before (a), during (b) and after (c) optogenetic 
manipulation in an open loop (imposed natural-scene orientations at the top, 
with vertical red lines emphasizing the relative orientations). Extended Data 
Figure 1a provides details of the optogenetic stimulation (opto-stim) protocol. 
a, Original pinning offset (arrow in d, top, shows the time of this snapshot). 
b, Optogenetic imposition of new offset. b, Top left, bump imposed on left side 
of the ellipsoid body (below, red rectangle) when scene oriented as at the top. 
b, Top right, 45° counter-clockwise rotated scene and bump with offset as in 
left. b, Middle, sequence of optogenetically imposed ellipsoid-body offsets (d, 
middle) (Methods). b, Bottom, expanded view of same sequence as shown in b 
(middle). c, After manipulation. The bump position relative to the same visual 
scene orientation as in a, shifted by offset imposed in b (compared, top and 
bottom). d, Compass neuron activity before (top), during (middle) and after 
(bottom) optogenetic manipulation (Supplementary Video 1). Arrow in the top 
panel corresponds to a; arrows in the middle panel correspond to the left and 


angular velocity should be in register. That is, for any given heading, 
plasticity should ensure that inhibitory ring neurons create a position 
of decreased inhibition in the ellipsoid body that coincides with where 
the P-EN input moves the bump: essentially a self-consistent mapping 
of visual cues onto the bump. 
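
The plasticity rule in component (iii) can be illustrated with a toy update step. This is a sketch under stated assumptions, not the implementation described in the Supplementary Information: the functional form, the parameter names (`eta`, `theta`, `w_max`) and the clipping are hypothetical choices that only reproduce the qualitative behaviour stated above (co-activation depresses the inhibitory weight, compass activity alone potentiates it).

```python
def update_weights(W, ring, compass, eta=0.1, theta=0.5, w_max=1.0):
    """One plasticity step on inhibitory weights W[i][j] (ring j -> compass i).

    Assumed rule: dW = eta * compass_i * (theta - ring_j), with activities in
    [0, 1]. An active ring neuron (ring_j > theta) paired with an active compass
    neuron is depressed; an inactive ring neuron paired with an active compass
    neuron is potentiated; silent compass neurons leave their weights unchanged.
    """
    new_W = []
    for i, c in enumerate(compass):
        row = []
        for j, r in enumerate(ring):
            dw = eta * c * (theta - r)
            # Keep inhibitory weight magnitudes within [0, w_max].
            row.append(min(w_max, max(0.0, W[i][j] + dw)))
        new_W.append(row)
    return new_W


# Hypothetical two-by-two example: ring neuron 0 and compass neuron 0 are co-active.
W0 = [[0.5, 0.5], [0.5, 0.5]]
W1 = update_weights(W0, ring=[1.0, 0.0], compass=[1.0, 0.0])
```

Repeated over many headings, an update of this kind carves out a position of reduced inhibition for each visual-feature position, which is the self-consistent mapping described above.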

We first tested the model for a simple scene with a single vertical 
stripe (Extended Data Fig. 1b-e), simulating the fly turning through 
the scene (Fig. 2e-g, Supplementary Video 2; a complex scene is shown 
in Extended Data Fig. 2a–c). These rotations ensured both that the 
bump travelled around the ellipsoid body and that ring neurons cor- 
responding to all visual-feature positions were selectively co-activated 
at appropriate angular orientations. Starting with random synaptic 
weights, Hebbian plasticity produced a spatially consistent mapping 
and stable offset between the heading representation and the angular 
position of the single visual feature (Fig. 2e). Simulating optogenetic 
manipulation as a current injection into model compass neurons repro- 
duced the remapping phenomenon (Fig. 2h, Extended Data Fig. 2d, e). 
These results account for the varying offsets observed across flies, 
the persistence of an offset for a given scene in a single fly and the flex- 
ibility that allows the ellipsoid body to track heading within different 
visual scenes. 


Optogenetic inversion of the map 


In further simulations, the natural concurrence between scene move- 
ment and bump position during turns could be inverted, with visual 
right panels of b (top); and arrow in the bottom panel corresponds to c. e, 
Simulation snapshots. Time-varying synaptic weights between ring and 
compass neurons (Extended Data Fig. 2). Simulation begins with random 
synaptic weights (left). Synapses between coactive ring and compass neurons 
are weakened. Synapses from inactive ring to active compass neurons are 
potentiated (see Supplementary Information for different plasticity rules). The 
weight matrix stabilizes over time (right) (Supplementary Video 2). Vertical 
purple rectangle, sample mapping from ring neuron 16 to all compass neurons. 
f, Simulated compass neurons when ring neuron 16 is active. g, Distribution of 
bump offsets across 500 simulations. h, Simulated optogenetic bump shift. 
Left, weight matrix before manipulation. Second and third from the left, a new 
map develops while the existing map weakens. Rightmost two panels, 
consolidation of the new map during a probe trial. Dashed red rectangle, initial 
synaptic weights from ring neuron 9 to compass neurons. Solid red rectangle, 
same weights after consolidation; offset shifted. 


cues overriding self-motion input to drive the bump backwards (Fig. 3a, 
b). In optogenetic offset-induction experiments, we found that the 
actual network was indeed flexible enough to induce an inverted remap- 
ping in which visual input drove the bump around the ellipsoid body 
in the opposite direction than would be expected (Fig. 3c, d, Supple- 
mentary Video 3). In the model, the inversion was eventually corrected 
after prolonged ring attractor dynamics driven by self-motion (Fig. 3b 
rightmost panel), but the short trial duration in our physiological 
experiments probably limited our ability to observe such a correc- 
tion in vivo. Thus, although self-motion exerts a strong influence over 
bump movement, network plasticity allows for a strong and notably 
flexible driving role for visual cues. 
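
The comparison reported in Fig. 3d can be sketched as follows: the circular variance of the bump offset is computed once for the normal ROI arrangement and once for the inverse arrangement, and the smaller value indicates which map the bump follows. In this sketch the inverse arrangement is modelled by negating the bump angle, which is an assumption made for illustration, and the function names are hypothetical.

```python
import cmath
import math


def circular_variance(angles):
    """1 - |mean resultant vector|: 0 for a constant angle, 1 for a uniform spread."""
    mean_vec = sum(cmath.exp(1j * a) for a in angles) / len(angles)
    return 1.0 - abs(mean_vec)


def offset_variances(bump_angles, scene_angles):
    """Return (normal, inverse) circular variance of the bump-to-scene offset."""
    normal = [b - s for b, s in zip(bump_angles, scene_angles)]
    # Assumed model of the inverse ROI arrangement: mirror the bump angle.
    inverse = [-b - s for b, s in zip(bump_angles, scene_angles)]
    return circular_variance(normal), circular_variance(inverse)


# Hypothetical probe trial in which the bump moves opposite to the scene.
scene = [2 * math.pi * k / 8 for k in range(8)]
bump = [1.0 - s for s in scene]
var_normal, var_inverse = offset_variances(bump, scene)
```

For this idealized inverted trial the inverse-arrangement variance is near zero while the normal-arrangement variance is near one, the pattern expected for a fully inverted map.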


Remapping after experiencing ambiguity 

Ring attractor dynamics ensures a single heading representation at any 
given time even for complex scenes, but under some circumstances 
this can be unstable. For example, a scene with two identical stripes at 
diagonally opposite locations (Extended Data Fig. 3a) makes orienta- 
tion within the scene inherently ambiguous. Our model predicts that, 
upon prolonged exposure to this two-stripe scene, the plasticity mecha- 
nism creates a visual map with two potential offset angles. If a single- 
stripe scene is then presented, this results in two competing heading 
representations, with the ring attractor network selecting one of them 
at any particular time (Extended Data Fig. 3b, c). We found a similar 
effect experimentally in some probe trials after just 5 min of in vivo 
Fig. 3 | Optogenetically imposed inverse mapping of visual scene onto 
compass neurons. a, Inverse mapping protocol, in which the stripe is angularly 
displaced opposite to the optogenetic bump displacement. b, Simulation of 
inverse mapping. Inverse mapping is complete after 864 s, and maintained 
during initial period of probe trial (left panel under ‘probe’). Sustained angular 
velocity input eventually corrects the map in simulations (right panel under 
‘probe’). c, Segments (60 s) of in vivo calcium transients before (top), during 
(middle) and after (bottom) a 10-min manipulation. Before manipulation, the 
bump followed the direction of stripe motion (top). After manipulation, bump 
motion mirrors stripe motion but in the opposite angular direction (bottom) 
(Supplementary Video 3). d, Circular variance of bump offset during the probe 
trial, computed for the normal arrangement of ellipsoid-body ROIs (normal), 
and for the inverse arrangement of ellipsoid-body ROIs (inverse). Four out of 
eight flies tested showed a smaller circular variance for the inverted 
arrangement of ellipsoid-body ROIs (white dots), indicating that the map was 
indeed inverted. Poor bump tracking—resulting from incomplete map 
manipulation—was observed in one fly, resulting in intermediate circular 
variances for both maps (black solid dots). Grey solid dots, three flies 
maintained the correct map. 


closed-loop experience with a two-stripe scene in the absence of any 
optogenetic manipulation (Extended Data Fig. 3d-i, Supplementary 
Video 4). A companion study to this Article finds electrophysiological 
and imaging signatures of offset switches in a larger fraction of experi- 
ments after walking flies experience such ambiguous scenes for longer 
durations. These results demonstrate how exposure to an ambiguous 
visual scene can, through the interactive influence of plasticity and 
ring attractor dynamics, affect the reliability of an otherwise-stable 
heading representation. 


Building a full map from partial views 


In our remapping experiments thus far, the fly performed multiple com- 
plete rotations to establish a stable heading representation in a novel 
setting, which seems unlikely under natural conditions. Drosophila 
can see nearly 320° of the visual scene from a single orientation and 
the E-PG bump typically activates more than 90° of the ellipsoid body; 
this suggests that even limited experience of a scene should trigger 
Hebbian plasticity that affects a large sector of the ellipsoid body. In 
the model, we found that full mapping of a visual scene could occur 
even if the bump was rotated only by 180° or less during optogenetic 
manipulation (Fig. 4a, b, Extended Data Fig. 4). We directly tested this 
prediction by imposing an angular relationship between a vertical 
stripe and an artificial compass-neuron bump, but this time limiting the 
suffices to induce global remapping. a, Experimental protocol in which 
optogenetic manipulation and the experience of scene orientations span only 
180°. b, Simulation of protocol with a simple single-stripe scene. After 
manipulation (t = 840 s under optogenetic stimulation), there are two sets of 
weak synapses (top left and bottom left), and the upper right corner of the 
weight matrix is completely erased. During the probe trial, a newly imposed 
offset propagated across the entire weight matrix (probe). c, Segments (60 s) 
of compass-population calcium transients before (top), during (middle) and 
after (bottom) optogenetic manipulation, spanning 180° of the ellipsoid body 
and using a naturalistic scene as oriented in a. Compare the offsets in the top 
and bottom panels. d, Distribution of absolute offset shift across flies. Left, 
baseline before manipulation. Right, offset shift by manipulation (10 flies, two- 
sided bootstrap test of mean difference, *P=0.0002). 


range of bump positions to 180°. Indeed, we found that—in the major- 
ity of flies (6 out of 10)—experiencing this limited range of bump 
positions was sufficient to induce a stable heading that matched the 
imposed offset in the probe period of the trial (Extended Data Fig. 4d, e). 
We could successfully induce a full remapping of the single-stripe scene 
in a few flies even in a more-constrained situation in which the range 
of bump positions spanned only 60° (in 7 out of 20 flies) (Extended 
Data Fig. 4i-k). Further analysis revealed that successful remapping 
was more likely when the stripe and the bump started inside the newly 
mapped region in the probe trial, consistent with simulations (Extended 
Data Fig. 4f-h, j, k). This probably occurred because the internally gen- 
erated angular velocity signal could move the bump into regions that 
were not previously traversed while still preserving the new offset, 
thereby allowing the new heading representation to stabilize. We also 
observed full remapping after limited-angle exposure in experiments 
with a natural scene (Fig. 4a, c, d). These results provide insights into 
how Hebbian plasticity combined with ring attractor dynamics enables 
the fly to convert information gathered from limited views of a novel 
scene into a complete heading representation within that scene. 
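
The offset shifts in Fig. 4d are assessed with a two-sided bootstrap test of mean difference. The procedure is not spelled out in this section, so the following is only a generic sketch under stated assumptions: measurements are paired per fly, the null hypothesis is imposed by centring the paired differences, and the function name, seed and example values are hypothetical.

```python
import random


def bootstrap_mean_diff_test(baseline, shifted, n_boot=10000, seed=0):
    """Two-sided bootstrap test that the mean paired difference is zero.

    Returns (observed mean difference, P value). Assumes one baseline and one
    post-manipulation value per fly, in the same order.
    """
    rng = random.Random(seed)
    diffs = [s - b for s, b in zip(shifted, baseline)]
    n = len(diffs)
    observed = sum(diffs) / n
    # Impose the null hypothesis by centring the differences on zero.
    centred = [d - observed for d in diffs]
    extreme = 0
    for _ in range(n_boot):
        resampled_mean = sum(rng.choice(centred) for _ in range(n)) / n
        if abs(resampled_mean) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (extreme + 1) / (n_boot + 1)


# Hypothetical absolute offset shifts (degrees) for 10 flies.
baseline = [10, 12, 9, 11, 10, 13, 8, 12, 11, 10]
after = [90, 97, 99, 86, 98, 105, 87, 95, 97, 91]
mean_shift, p_value = bootstrap_mean_diff_test(baseline, after)
```

With the add-one correction, the smallest reportable P value is 1/(n_boot + 1), so the number of resamples bounds the resolution of the test.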


Stability of the compass in two-dimensional scenes 

Looking across all experiments, we observed that heading represen- 
tations exhibit a varying degree of stability across different scenes 
(Fig. 5a, b). We wondered whether structure in the vertical dimen- 
sion, which is typical for natural scenes and known to be encoded by visual 
ring neurons, could resolve potential ambiguities in scenes 
with repeating visual features in the horizontal dimension (for example, 
Fig. 5 | The stability of bump dynamics is predicted by two-dimensional
information in visual scenes. a, Four natural scenes (F, forest (Fig. 1h); O, open
field (Fig. 1i); D, dense forest; and B, bush) were downsampled and discretized.
Two artificial scenes with the same local features at the same (SE) and different
elevations (DE) were also used. b, Circular variance of instantaneous pinning
offset with natural scenes (4 repetitions of 2 scenes per fly, 40 trials from
10 flies for each condition) (Methods). The bump reliably tracked the
orientation of forest and open-space scenes (indicated by low circular
variance; F, mean = 0.2771, 95% confidence interval = [0.2231, 0.3344]; O,
mean = 0.2180, 95% confidence interval = [0.1616, 0.2828]). Tracking was poor
(a high circular variance) for dense-forest and bush scenes (D, mean = 0.5163,
95% confidence interval = [0.4557, 0.5752]; B, mean = 0.5528, 95% confidence
interval = [0.4893, 0.6135]). Bootstrap tests of the difference in mean circular
variance between each pair of scenes showed significant difference across all
pairs (two-sided, P < 0.0001), except between forest and open-space scenes
(two-sided, P = 0.169) and between dense-forest and bush scenes (P = 0.406).

c, Circular variance of instantaneous pinning offset for the same-elevation and
different-elevation artificial scenes (4 repetitions of both scenes per fly,
40 trials from 10 flies) (Methods). The offset was stable (a low circular variance)
for the different-elevation scene (mean = 0.1212, 95% confidence interval =
[0.1046, 0.1392]), but not for the same-elevation scene (mean = 0.7521, 95%
confidence interval = [0.6937, 0.8045]). The mean circular variance between
scenes was significantly different (two-sided bootstrap test, P < 0.0001).

d, Two-dimensional autocorrelation of natural scenes. e, One-dimensional 
(1D; right) and two-dimensional (2D; left) autocorrelation of the same-elevation 
and different-elevation artificial scenes; the one-dimensional autocorrelation is 
identical, but the two-dimensional autocorrelations are different. 


the ‘same-elevation’ scene in Fig. 5a). Using artificial stimuli, we found 
that the bump reliably tracked the orientation of an artificial scene 
with four identical objects placed at different elevations, whereas it 
could not stably track when these objects were placed at the same 
elevation (Fig. 5c). This stability is well-predicted by the fact that the 
two-dimensional autocorrelation of each scene is distinctly single-
peaked (Fig. 5d, e). We conclude that the two-dimensional organization
of a scene contributes to the generation and stability of the
pinning offset. 

Some insects are capable of snapshot-based navigation, in
which stored visual scenes are recalled to drive scene-specific directional
actions. Further analysis of our model indicated that multiple
visual maps can be stored simultaneously if plasticity between visual
ring neurons and compass neurons is presynaptically gated and the network
has access to a rich ring-neuron representation of visual scenes
(Extended Data Figs. 5, 6, Supplementary Information). Other spatially
informative sensory inputs, including spectral, mechanical (for example,
wind) and olfactory cues, may also contribute to differentiating
natural sensory environments.
Discussion


We have shown how inhibitory Hebbian plasticity can rapidly transform
visual feature information into an attractor-driven internal represen- 
tation. Angular velocity input to the attractor converts an emerging 
mapping on the basis of limited views of a scene into a complete and 
consistent heading representation, a potentially critical function in 
animal navigation. The induction of inverse maps emphasizes the 
notable flexibility of the system. A key issue that remains unresolved 
is the nature of bump dynamics during translation in a two-dimen- 
sional environment. Mammalian head-direction cells are unaffected 
by translation, but our model suggests that the compass circuit tracks
the angle between the orientation of the fly and an object in the visual
scene without correcting for translation, potentially making it a local
compass. However, the plasticity that we have identified required only a
few minutes, and may be even faster under natural conditions when the 
system can co-opt an existing mapping from ring to compass neurons. 
In our simulations (data not shown), this timescale prevented nearby 
objects and transient stimuli—such as neighbouring conspecifics that 
would not move coherently with the bearing of the fly—from being 
mapped, but tethered the compass to distant objects that moved coher- 
ently with the turns of the fly. 

The locus of plasticity is likely to be synapses between ring and compass
neurons; this idea is also favoured by the authors of the accompanying
Article, who present electrophysiological evidence that
is consistent with plasticity altering inhibitory visual inputs to
individual compass neurons. At a synaptic and biophysical level, it
remains to be seen how the Hebbian mechanism that we have proposed
relates to, and interacts with, other forms of plasticity such as
spike-timing-dependent plasticity, or with plasticity-inducing
mechanisms such as nitric oxide signalling in the ellipsoid body,
dopaminergic modulation (as seen in the fly mushroom body) or
plateau potentials (as seen during remapping of hippocampal place
cells).

Our results support a model in which plasticity is constantly active 
to allow rapid adaptation to new settings, enabling the ring attractor 
to generate a single heading direction even in a complex environment.
Such stable sensorimotor representations probably enable animals to 
overcome transient uncertainties in their surroundings as they pursue 
diverse behavioural goals. 
Methods


No statistical methods were used to predetermine sample size. The 
experiments were not randomized and investigators were not blinded 
to allocation during experiments and outcome assessment. 


Nomenclature 

We follow an abbreviation convention that is agreed upon by most 
research groups working on the central complex. For E-PG or compass
neurons, E (ellipsoid body) before a ‘-’ represents predominantly
spiny and putatively postsynaptic processes, and P (protocerebral
bridge) and G (gall) after a ‘-’ represent predominantly bouton-like and
probable presynaptic processes. When fully expanded, the abbreviation
E-PG stands for PBG1-8.b-EBw.s-D/Vgall.b. Similarly, P-EN neurons
(Fig. 1), which arborize in the noduli (N), refer to PBG2-9.s-EBt.b-NO1.b
neurons.


Terminology 

In the manuscript, we use the term heading representation to describe
what it is that the E-PG neurons encode. However, the representation
often persists when a tethered fly is standing still on a ball (that is,
when it has no heading in a strict sense). On the basis of such data, we
would define heading as the angular orientation of the body axis of the
fly in a visual scene. Future experiments may well determine that E-PG
neurons represent the head direction of the fly, but all E-PG imag- 
ing experiments thus far—including those in this study—have been 
performed on head-fixed flies in tethered preparations, leaving this 
issue unresolved. 


Fly stocks 

Fly stocks have previously been described. In brief, flies with
either a codon-optimized UAS-GCaMP6f or a recombinant of UAS-
CsChrimson-mCherry-tag and UAS-GCaMP6f or codon-optimized
UAS-GCaMP6f were driven by split-GAL4 SS00096 from the Rubin
laboratory. All experiments were performed with 6-10-day-old female
flies. Flies were randomly picked from their housing vials for all experiments.
All flies were raised from the egg stage on standard cornmeal and
soybean-based medium or with additional 0.2 mM all-trans-retinal
for flies with CsChrimson.


Fly preparation for imaging during head-fixed flight 

The procedure for fly preparation has previously been described.
In brief, flies were anaesthetized on a cold plate at 4 °C. The front legs
were removed and the proboscis was pressed into its head capsule
and immobilized with wax to minimize brain movement. The fly was
tethered at the tip of a tungsten wire and positioned under a custom-
designed stainless-steel shim, as previously described. The back
of the head capsule was kept nearly vertical to maximize exposure of 
the eyes of the fly to the surrounding light-emitting-diode (LED) arena. 
UV-curable adhesive was used to fix the head under the shim, then 
the cuticle at the top of the head and fat cells were carefully removed 
and trachea were carefully pushed to the back of the brain to optically 
reveal the central brain. 


Visual stimulation 

Visual arena. The hardware has been described. In brief, a female fly
was placed at the centre of the arena and visual stimuli were presented
on a vertically placed cylindrical LED display spanning 330° in azimuth
and 60° in elevation. The display was covered with multiple layers of
colour filter to avoid excessive leak into a photon detector and a diffuser
to avoid reflection. The wingbeat amplitude of each wing
was computed online by analysing images acquired with a camera,
using custom-built image analysis software written in MATLAB, similar
to a previously described method. The image acquisition rate of
the camera was 119.2 Hz, which was slow enough to capture the full
shadow of wings to compute the wingbeat amplitude. For closed-loop
experiments, the gain was 5.1° per second for each degree of the difference
between the left and right wingbeat amplitudes (ΔWBA). Air
was manually puffed at the fly if it stopped flying. The data during this
stalled period were excluded from analyses.
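
The closed-loop rule above can be illustrated with a minimal Python sketch (the original software was written in MATLAB and is not reproduced here). The function names, the left-minus-right sign convention and the one-frame update step are assumptions made for illustration; only the gain (5.1° per second per degree of ΔWBA) and the camera rate (119.2 Hz) come from the text.

```python
FRAME_RATE_HZ = 119.2  # camera acquisition rate stated in the text
GAIN = 5.1             # deg/s of scene rotation per degree of delta-WBA


def scene_velocity(left_wba_deg, right_wba_deg, gain=GAIN):
    """Scene angular velocity (deg/s) from the wingbeat-amplitude difference.

    The sign convention (left minus right) is an assumption.
    """
    return gain * (left_wba_deg - right_wba_deg)


def update_scene(angle_deg, left_wba_deg, right_wba_deg, dt=1.0 / FRAME_RATE_HZ):
    """Advance the scene orientation by one camera frame, wrapped to [0, 360)."""
    return (angle_deg + scene_velocity(left_wba_deg, right_wba_deg) * dt) % 360.0
```

For example, a 2° amplitude difference would rotate the scene at 10.2° per second under these assumptions.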


Stimuli. We used various visual stimuli. Natural scenes were derived 
from panoramic photographs taken at the Janelia Research Campus. 
Using the full luminance resolution of the arena resulted in excessive 
leak into a photon detector even after multiple layers of filters, making 
it impossible to detect bump position (especially with extremely low 
laser power used for simultaneous imaging and optogenetic stimula- 
tion). Further, the level of light at full luminance was enough to activate 
CsChrimson in most flies. To reduce the light leak and undesired activa- 
tion of CsChrimson, we downsampled and monochromatized natural 
scene photographs (Fig. 1h, i, 4a, 5a) to four luminance levels close to a
log scale (0, 2, 6 and 15). Other visual stimuli included a bright vertical
stripe spanning 60° in elevation and 15° in azimuth (Fig. 3a, Extended
Data Figs. 1b, 3b, 4d, i), two bright vertical stripes 165° apart (Extended
Data Fig. 3a), a random dot pattern of which each pixel is either of maximum
brightness or dark, and patterns containing four small horizontal
bars each spanning 30° in azimuth and 15° in elevation (Extended Data
Fig. 5a, b). All the stimuli used in this study were presented on a blue
LED arena. We used a greyscale in the figures for visual clarity. To avoid 
a sudden luminance change that might induce a startle response in 
flies, the 30° arena gap behind the fly was stitched in all protocols to 
maintain overall luminance. Thus, when an object crosses the gap, it 
does not disappear but jumps across it. 


Protocols 

Optogenetic bump offset shift. An experiment (14 flies) (Extended Data
Fig. 1b-d, i, m) began with a 1-min exposure to a closed-loop random dot
stimulus (trial 1). It was followed by 3 1-min closed-loop single-stripe
trials (trials 2-4), a 5-min optogenetic manipulation trial that imposes
a fixed 90° offset between the bump and a scene (trial 5), 3 1-min closed-
loop single-stripe trials (trials 6-8), another 5-min optogenetic trial with
-90° offset (trial 9), and 2 1-min closed-loop single-stripe trials (trials 10
and 11). Each trial was followed by a 15-s dark trial before the next trial
started. During optogenetic manipulation trials, 8 positions in the ellipsoid
body, separated by 45° (with a visual stimulus of a corresponding
offset), were sequentially stimulated, each of which took approximately
2-2.5 s. The initial position of the visual stimulus during closed-loop trials
was random. Trial 2 was used for flies to establish a stable offset. Trials
3 and 4 and trials 7 and 8 were used to measure the baseline variability
of the bump offset within a single fly before optogenetic manipulation.
Trials 6 and 7 and trials 10 and 11 were used to measure the baseline
variability after optogenetic manipulation. Trials 4 and 6 were used to
measure the effect of optogenetic manipulation in trial 5 (90° offset).
Trials 8 and 10 were used to measure the effect of optogenetic manipulation
in trial 9 (-90° offset). Control experiments (ten flies each) used the
same order of trials except that either CsChrimson was not expressed
(Extended Data Fig. 1j, n) or the stripe was not presented (Extended Data
Fig. 1k, o) during manipulation trials. A natural scene was also tested
(Fig. 2a-d, Extended Data Fig. 1f, h, l). To increase statistical power, all
data collected before or after the -90° protocol were rotated 180° and
pooled with the 90° protocol during analyses.


Bump offset shift with two vertical stripes. The order of trials was 
identical to optogenetic bump offset shift experiments, but—during 
manipulation trials—two stripes at opposite sides of the visual field 
(165° apart in the 330° arena) were presented under closed-loop con- 
trol (Extended Data Fig. 3d-i). Trials 6 and 10 were used to measure 
the number of bumps and the bump offset variance for the initial 15 s 
after manipulation trials, and trials 7 and 11 were used as control trials. 
Ten flies were tested. 


Forced optogenetic inverse mapping. There were two 1-min single- 
stripe closed-loop trials followed by 10 min of an optogenetic inverse 
mapping trial and 2 min of a probe trial (Fig. 3). Consecutive trials were
separated by a 3-s dark trial.


Natural scene protocols. Two 2-min closed-loop trials with a down- 
sampled and monochromatized forest scene were presented (trials 1 
and 2). They were followed by 2 2-min closed-loop trials with an open- 
space scene (trials 3 and 4), and all 4 trials were repeated (trials 5-8). 
All consecutive trials were separated by a 5-s dark trial. The initial 
scene orientation of each trial was random. Trials 2 and 5 were used to 
measure the offset shift between two forest-scene trials separated by 
open-space scene trials. Trials 4 and 7 were used to measure the offset 
shift between two open-space scene trials separated by forest-scene 
trials. Trials 2 and 3 were used to measure the offset shift during the 
transition from a forest scene to an open-space scene. Trials 4 and 5
were used to measure the offset shift during the transition from an
open-space scene to a forest scene. Ten flies were tested (Fig. 1h, i, 5b,
d). The whole protocol was repeated for another pair of less-reliable
natural scenes (dense forest and bush) (Fig. 5a, b, d). Finally, to address
the relevance of two-dimensional organization of the visual scene to 
the bump position computation, the same protocol was repeated with 
2 scenes of 4 artificial objects: in each scene, four horizontal objects 
were presented with equal azimuthal separation and either the same 
or different elevations (Fig. 5a, c, e). 


Bump offset shift with limited optogenetic manipulation. An experiment
(Fig. 4, Extended Data Fig. 4) began with a 1-min closed-loop
trial with a single stripe (trial 1). It was followed by a 2-min closed-loop
single-stripe trial (trial 2), a 30-s open-loop probe trial (trial 3), a 5-min
open-loop manipulation trial (trial 4), a 30-s open-loop probe trial
(trial 5), a 2-min closed-loop trial (trial 6), a 30-s open-loop probe trial
(trial 7), a 5-min open-loop manipulation trial (trial 8), a 30-s open-loop
probe trial (trial 9) and a 2-min closed-loop trial (trial 10). All consecutive
trials (except the probe trials following manipulation trials) were
separated by a 3-s dark trial. The initial scene orientation of closed- 
loop trials was random. During trial 2, the bump offset was roughly 
determined by visual inspection. Then, a target offset was determined 
to be 180° away from this baseline offset and optogenetically imposed 
during manipulation trials. Three manipulation protocols were 
used (ten flies each). The first protocol (local protocol 1) spanned 
60° of the ellipsoid body, in which 3 positions separated by 30° 
were optogenetically stimulated. Each position was stimulated for 
1.5-2.5 s in sequence. The probe trials were composed of the same visual
stimuli used during optogenetics trials to measure the effectiveness of 
the optogenetic manipulation. The position of a stripe in closed-loop 
probe trials began at the middle of the range of stripe positions used 
during manipulation. The second protocol (local protocol 2) spanned 
60° of the ellipsoid body, in which three positions separated by 30° 
were optogenetically stimulated. Each position was stimulated for 
1.5-2.5 s in sequence. During probe trials, two stripe positions (one
at the centre of the manipulated area and another 180° away from it) 
were repeatedly presented (each for 3 s) to probe the global effect 
of local manipulations. The position of a stripe in closed-loop probe 
trials was random. For further analysis, flies from the two protocols 
(local protocols 1 and 2) were pooled (Extended Data Fig. 4j, k) and 
regrouped depending on the position of the bump and the stripe at 
the beginning of the probe trial. The final protocol (local protocol 3) 
spanned 180° of the ellipsoid body (Extended Data Fig. 4d), in which 
8 positions separated by 22.5° were optogenetically stimulated. The 
same probe stimuli as in local protocol 2 were used in addition to 8 
stripe positions separated by 45° to cover all orientations. The offset 
during probe trials was measured over the final 5 s. The final protocol
was repeated with a natural scene (Fig. 4). 


The position of the pattern, wingbeat amplitudes, air-puffing signal 
and two-photon frame trigger were all simultaneously collected using 
custom software written in MATLAB that used National Instruments
data acquisition hardware. 


Two-photon calcium imaging 

Calcium imaging was performed using a custom-built two-photon 
microscope. We used a 40× objective (NA 1.0, 2.8 mm WD) and a GaAsP
photomultiplier tube (PMT). A Chameleon Ultra II laser tuned to 930 nm
with a custom-built pulse compressor was used as the excitation source
with a maximum power of 8 mW at the sample. We used the same saline
as in previous studies with adjusted calcium concentration at 2.0 mM.
We imaged the ellipsoid body over 6-plane volumes using a fast remote
focusing technique, which was modified in-house, at a volume rate of
9.8 Hz (256 × 256 resolution, 58.8-Hz frame rate) with an equal
spacing of 3-6 μm between individual scanning planes. The objective
was tilted by 30° to enable imaging of the ellipsoid body with the head 
of the fly at a natural, vertical angle. 


Two-photon optogenetic stimulation 

The protocol used was largely along previously described lines, but
differed in a few details. A single two-photon laser source was used for
both imaging and optogenetic stimulation, by temporally modulating
the laser power, which was implemented using the PowerBox feature
in ScanImage, replacing the custom MATLAB software described in
previous work (Extended Data Fig. 1a). Increased two-photon efficiency
owing to a pulse compressor allowed a lower laser power for
imaging and optogenetic stimulation than previously described. 
For the calcium-imaging-only period, a maximum laser power of 
2 mW was used for both forward and backward scanning phases. 
During optogenetic stimulation of CsChrimson, the laser power was 
kept the same except for the defined stimulation area only during the 
forward scanning phase, in which a maximum laser power of 30 mW 
(typically 20 mW) was used. To prevent tissue damage, this laser power 
was manually adjusted during each trial to a minimal power that was 
sufficient to develop a bump at the site of stimulation. On average, the
optogenetically induced GCaMP signal measured during the backward 
scanning phase was 13.3% greater than the normal condition across flies 
(one-tailed paired t-test, P = 0.022) in the optogenetic bump-shifting
experiment with a natural scene. This higher-than-natural activity 
was required to inhibit the naturally generated bump. However, results
from the two-vertical-stripe protocol indicate that plasticity can be induced
at the natural activity level.


Data analysis 

We used MATLAB for data analysis. To avoid bias, no statistical methods 
were used to predetermine the power and the sample size. The fixed- 
offset optogenetic experiment used 14 flies, and the forced optogenetic 
inverse mapping experiments relied on 8 flies. All other experiments 
were performed until data from 10 flies were collected. 


Calculation of fluorescence changes. The background noise level 
was predetermined by measuring the oscillatory noise from the PMT. 
This level was then subtracted from all imaging data, and the data were 
half-rectified before further analysis. A running average intensity pro- 
jection of a volume (six planes) at a given time was computed for each 
pixel. Then, 16 ROIs were manually assigned, as previously described.
Next, time series for each ROI were obtained by taking the average of
the fluorescence signal within the ROI at each point in time. For calcium
imaging experiments without optogenetics, ΔF/F0 was computed using
F0 as the mean of the lowest 10% of signals in each ROI. No further
temporal smoothing was applied. 
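
As an illustration of the steps described above, the following Python sketch subtracts a background level, half-rectifies the result and computes ΔF/F0 for one ROI time series. It is a hedged reconstruction, not the authors' MATLAB code: the function names are hypothetical, and the sketch assumes that F0 is strictly positive and that at least one sample is used for the lowest-10% average.

```python
def preprocess(raw, background):
    """Subtract the background level and half-rectify (clip negatives to 0)."""
    return [max(0.0, value - background) for value in raw]


def delta_f_over_f0(trace, low_fraction=0.10):
    """Return (F - F0) / F0, with F0 the mean of the lowest 10% of samples."""
    n_low = max(1, int(len(trace) * low_fraction))  # use at least one sample
    f0 = sum(sorted(trace)[:n_low]) / n_low         # assumes F0 > 0
    return [(f - f0) / f0 for f in trace]
```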


PVA of a bump and its amplitude. As a simple measure of the bump 
position and strength, the PVA was computed as the weighted vector 
average across ellipsoid-body wedges, with the weight determined by
the fluorescence level (ΔF/F0), and the vector determined by the position
of each ROI in the ellipsoid body. The amplitude of the PVA was
determined as the length of the average vector. We used brewermap 
(S. Cobeldick, MathWorks file exchange) with a colour scheme ‘blue’ 
from http://colorbrewer2.org/ to depict all PVA plots. 
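
A minimal Python sketch of such a weighted vector average is given below. It assumes that the ROIs are evenly spaced around the ellipsoid body, that ROI k sits at angle 360°·k/N, and that the vector sum is normalized by the total weight; these conventions and the function name are illustrative assumptions rather than details taken from the text.

```python
import cmath
import math


def population_vector_average(dff):
    """Return (bump angle in degrees, PVA amplitude) for one imaging frame.

    dff holds one dF/F0 value per ROI; ROI k is assumed to sit at 360*k/N deg.
    """
    n = len(dff)
    vector = sum(w * cmath.exp(2j * math.pi * k / n) for k, w in enumerate(dff))
    total = sum(dff)
    if total > 0:
        vector /= total  # normalization by total weight is an assumption
    return math.degrees(cmath.phase(vector)) % 360.0, abs(vector)
```

With a single active ROI the amplitude is 1, and with uniform activity across all ROIs it falls to approximately 0.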


Calculation of the number of bumps. For each frame, a bump was defined
as any contiguous set of ROIs with ΔF/F0 greater than a threshold
value (defined in each frame to be the mean ΔF/F0 across ROIs + 1 s.d.)
(Extended Data Fig. 3h).
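
The bump-counting rule can be sketched in Python as follows. Two choices here are assumptions not stated in the text: contiguity is treated as circular (the last ROI neighbours the first, because the ellipsoid body is a ring), and the population standard deviation is used for the threshold.

```python
import statistics


def count_bumps(dff):
    """Count contiguous runs of ROIs above mean + 1 s.d. on a ring of ROIs."""
    threshold = statistics.mean(dff) + statistics.pstdev(dff)
    above = [value > threshold for value in dff]
    # A bump starts wherever an above-threshold ROI follows a below-threshold
    # one; index -1 wraps to the last ROI, which makes the ring circular.
    return sum(1 for k in range(len(above)) if above[k] and not above[k - 1])
```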


Offset between the estimated bump position and the pattern posi- 
tion, and offset deviation. For a given trial, the first 15 s were discarded, 
as were time points when the fly did not fly, which were determined by 
the wingbeat amplitude. The offset between the absolute scene orienta- 
tion (to the experimenter) and the PVA estimate was calculated as the 
mean angular difference for the remaining time. The deviation was 
calculated as the circular variance. The visual arena (covering 330°) 
was mapped to 360°, as was the position of the scene. 
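
The offset calculation can be sketched in Python as below. The helper names, the scene-minus-bump sign convention and the boolean `flying` mask are assumptions; the 330° to 360° mapping, the 15-s discard and the use of circular mean and circular variance follow the description above. The sketch assumes that at least one valid time point remains after exclusion.

```python
import cmath
import math

ARENA_SPAN_DEG = 330.0  # azimuthal extent of the LED arena


def arena_to_circle(position_deg):
    """Stretch a position on the 330-degree arena onto a full 360-degree circle."""
    return (position_deg * 360.0 / ARENA_SPAN_DEG) % 360.0


def circular_mean_and_variance(angles_deg):
    """Circular mean (deg) and circular variance (1 - mean resultant length)."""
    z = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg) / len(angles_deg)
    return math.degrees(cmath.phase(z)) % 360.0, 1.0 - abs(z)


def offset_and_deviation(scene_deg, bump_deg, flying, rate_hz, discard_s=15.0):
    """Offset and its deviation for one trial, skipping the first 15 s and
    any time points at which the fly was not flying."""
    start = int(discard_s * rate_hz)
    samples = list(zip(scene_deg, bump_deg, flying))[start:]
    diffs = [(s - b) % 360.0 for s, b, is_flying in samples if is_flying]
    return circular_mean_and_variance(diffs)
```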


Analysis of optogenetic offset manipulation trials. The exact artifi- 
cial offset imposed by optogenetic stimulation during manipulation 
trials was determined by the mean angular difference between the scene 
orientation and the PVA during optogenetic stimulation. 


Circular linearity test. For the optogenetic manipulation protocol, 
the expected amount of offset shift was assumed to be the same as the 
artificially imposed amount of shift. The sum of absolute angular differ- 
ence between these two values across flies was used as a test statistic. 
To obtain the null distribution, the observed amounts of shift were 
randomized across flies and the sum of absolute angular differences 
was calculated, all of which was repeated 10,000 times. The P value was
calculated by counting the number of outcomes from randomization 
that were smaller than the test statistic (Extended Data Fig. 1h-k). 
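
The randomization procedure described above can be sketched in Python as follows. The function names and the fixed random seed are illustrative assumptions, and the P value is expressed here as the fraction (rather than the raw count) of shuffles whose summed difference is smaller than the observed statistic.

```python
import random


def angular_difference(a_deg, b_deg):
    """Absolute angular difference in degrees, in the range [0, 180]."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def circular_linearity_test(imposed, observed, n_iter=10000, seed=0):
    """Return (test statistic, P value) for imposed versus observed shifts."""
    rng = random.Random(seed)
    stat = sum(angular_difference(i, o) for i, o in zip(imposed, observed))
    shuffled = list(observed)
    smaller = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)  # randomize observed shifts across flies
        null = sum(angular_difference(i, o) for i, o in zip(imposed, shuffled))
        if null < stat:
            smaller += 1
    return stat, smaller / n_iter
```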


Circular unimodality or circular asymmetricity test. We used this 
test to determine whether a set of directional data was significantly 
unimodal or asymmetric. The circular variance of the data was used 
as a test statistic. Each data point was assigned a random direction 
sampled from a circularly uniform distribution, after which the cir- 
cular variance was calculated. This random assignment procedure 
was repeated 10,000 times to generate a null distribution. The P value
was determined by the number of times at which the circular variance 
was smaller than the test statistic (Fig. 1j, Extended Data Fig. 5c). This 
method reliably works only for unimodal data and may generate false- 
negative results for multimodal data. 
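
A Python sketch of this test is shown below. The function names and seed are hypothetical, and the P value is reported as the fraction of random assignments whose circular variance is smaller than that of the data.

```python
import cmath
import math
import random


def circular_variance(angles_deg):
    """Circular variance: 1 minus the mean resultant length."""
    z = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg) / len(angles_deg)
    return 1.0 - abs(z)


def circular_unimodality_test(angles_deg, n_iter=10000, seed=0):
    """Return (circular variance of the data, P value against uniformity)."""
    rng = random.Random(seed)
    stat = circular_variance(angles_deg)
    n = len(angles_deg)
    smaller = 0
    for _ in range(n_iter):
        # Assign each data point a direction drawn from a circular uniform.
        null = [rng.uniform(0.0, 360.0) for _ in range(n)]
        if circular_variance(null) < stat:
            smaller += 1
    return stat, smaller / n_iter
```

Tightly clustered directions should give a small P value, whereas evenly spread directions should give a P value close to 1.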


Bootstrap test of the mean difference. This test was used to establish 
the difference of means of two datasets when they did not satisfy the 
assumption of Gaussian distributions. The difference of means of two 
datasets was used as a test statistic. Two sets of data were pooled, random
samples were assigned to each group either with (bootstrap) or without 
(randomization) replacement, and the difference of the means of the 
two groups were calculated. This process was repeated 10,000 times to 
generate the null distribution. The P value was computed by counting 
the number of events with an outcome that was greater than the test 
statistic (Extended Data Fig. 1l-o). Random sampling both with and
without replacement generated similar P values in all tests in our study.
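
The resampling test can be sketched in Python as below. This is a one-sided version under stated assumptions: the function name, the seed and the reporting of the P value as a fraction are illustrative, and a two-sided variant (for example, comparing absolute differences) is not shown because the text does not specify how it was implemented.

```python
import random


def mean_difference_test(x, y, n_iter=10000, replace=True, seed=0):
    """One-sided resampling test of mean(x) - mean(y).

    replace=True resamples the pooled data with replacement (bootstrap);
    replace=False permutes the pooled data (randomization).
    """
    rng = random.Random(seed)
    observed = sum(x) / len(x) - sum(y) / len(y)
    pooled = list(x) + list(y)
    greater = 0
    for _ in range(n_iter):
        if replace:
            sample = [rng.choice(pooled) for _ in pooled]
        else:
            sample = rng.sample(pooled, len(pooled))
        first, second = sample[:len(x)], sample[len(x):]
        if sum(first) / len(first) - sum(second) / len(second) > observed:
            greater += 1
    return observed, greater / n_iter
```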


Circular variance of pinning offset. The variance in pinning offset 
relative to each scene (Fig. 5b, c) was computed as the circular variance 
of the instantaneous pinning offset along the time of a single trial. Each
fly experienced four repetitions of two scenes. For each scene, all trials 
were pooled across flies (in total, 40 trials each). 


Circular variance of inverse map. The circular variance of the bump 
offset during the probe trial was calculated for both normally arranged 
ellipsoid-body ROIs and inversely arranged ellipsoid-body ROIs. If the 
circular variance of the latter was smaller than the former, the mapping 
from the visual scene orientation to compass neurons was determined 
to be inverted (Fig. 3d). 
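
The comparison above can be sketched in Python as follows. The sketch assumes that reversing the ROI order mirrors the estimated bump angle (θ becomes a constant minus θ), so that the offset under the inverse arrangement varies as scene + bump instead of scene - bump; the constant does not affect the circular variance. The function names are hypothetical.

```python
import cmath
import math


def circular_variance(angles_deg):
    """Circular variance: 1 minus the mean resultant length."""
    z = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg) / len(angles_deg)
    return 1.0 - abs(z)


def is_inverted(scene_deg, bump_deg):
    """True if the offset is more stable under the mirrored ROI arrangement."""
    normal = circular_variance([(s - b) % 360.0 for s, b in zip(scene_deg, bump_deg)])
    mirrored = circular_variance([(s + b) % 360.0 for s, b in zip(scene_deg, bump_deg)])
    return mirrored < normal
```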


Binomial exact test. For Extended Data Fig. 4j, the baseline probability 
of flies shifting their offsets by more than 90° is 1 out of 7 if the stripe 
starts outside manipulated positions (red dots). Assuming binomial 
sampling from this distribution, the chance of 6 or more flies out of 
13 shifting their offsets by more than 90° (blue dots) is P = 0.0059. For
Extended Data Fig. 4k, the baseline probability of flies shifting their
offsets by more than 90° is 3 out of 16 if the stripe or bump starts outside
the manipulated positions (red dots). The chance of all 4 flies shifting
their offsets by more than 90° (blue dots) assuming binomial sampling
with a probability of 3/16 is P = 0.0012.
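
Both probabilities are upper tails of a binomial distribution and can be checked with a few lines of Python; the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Probability of k or more successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# 6 or more of 13 flies at a baseline rate of 1/7 (Extended Data Fig. 4j)
p_4j = binomial_upper_tail(6, 13, 1 / 7)

# all 4 of 4 flies at a baseline rate of 3/16 (Extended Data Fig. 4k)
p_4k = binomial_upper_tail(4, 4, 3 / 16)
```

Rounded to two significant figures, these evaluate to 0.0059 and 0.0012, in agreement with the values quoted above.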


Natural scene analysis. Each scene was smoothed with a two-dimensional
Gaussian filter with a s.d. of 4 pixels (Extended Data Fig. 5e). Then, the
two-dimensional autocorrelation of each scene was calculated (Fig. 5d).
Each scene was tiled horizontally (three copies) and the top and the bottom
were padded with zeros. Then, MATLAB function xcorr2 was applied
to this tiled scene, and to another scene representing the centre of this
tiled scene. The middle range of azimuth values of the outcome (corresponding
to the azimuthal range of one scene within the tiled image)
was finally normalized by the maximum value to obtain two-dimensional
autocorrelation. The one-dimensional autocorrelation was obtained by
first taking the average intensity of the smoothed scene over elevation,
then applying xcorr between this one-dimensional trace and a concatenated
version of this trace, and finally normalizing by the maximum
value. The two-dimensional cross-correlation was computed in the same
way, except that xcorr2 was applied to two tiled scenes: one scene with
three horizontal copies of itself padded at the top and bottom, and another
scene without horizontal copies but padded at the top and bottom.
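
The effect of tiling in azimuth and zero-padding in elevation can be illustrated with a small pure-Python sketch that treats azimuth as circular and elevation as bounded. It is not the MATLAB xcorr2 pipeline described above: Gaussian smoothing is omitted, the toy scenes are 4 × 8 pixels with single-pixel objects, and all function names are hypothetical. Under these assumptions it reproduces the qualitative point of Fig. 5e, namely that the same-elevation and different-elevation scenes share a one-dimensional autocorrelation but differ in two dimensions.

```python
def autocorr_2d(scene):
    """2D autocorrelation, circular in azimuth and zero-padded in elevation."""
    h, w = len(scene), len(scene[0])
    out = {}
    for dy in range(-(h - 1), h):
        for dx in range(w):
            total = 0.0
            for y in range(h):
                if 0 <= y + dy < h:  # rows shifted outside the scene count as 0
                    for x in range(w):
                        total += scene[y][x] * scene[y + dy][(x + dx) % w]
            out[(dy, dx)] = total
    peak = max(out.values())
    return {shift: value / peak for shift, value in out.items()}


def autocorr_1d(scene):
    """Circular autocorrelation of the elevation-averaged intensity profile."""
    h, w = len(scene), len(scene[0])
    profile = [sum(scene[y][x] for y in range(h)) / h for x in range(w)]
    raw = [sum(profile[x] * profile[(x + dx) % w] for x in range(w)) for dx in range(w)]
    peak = max(raw)
    return [value / peak for value in raw]


def make_scene(rows, h=4, w=8):
    """Toy scene: one bright pixel per object, equally spaced in azimuth."""
    scene = [[0.0] * w for _ in range(h)]
    for i, row in enumerate(rows):
        scene[row][i * (w // len(rows))] = 1.0
    return scene


same_elevation = make_scene([1, 1, 1, 1])
different_elevation = make_scene([0, 1, 2, 3])
```

In this toy example the same-elevation scene has four equal maxima in its two-dimensional autocorrelation, whereas the different-elevation scene has a single maximum at zero shift.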


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 

Extended Data Fig. 1| Manipulation of pinning offset of heading 
representation relative to visual scene. a, Schematic shows simultaneous 
calcium imaging and localized optogenetic stimulation. b—d, Snapshots of 
compass-neuron population activity before, during and after optogenetic 
manipulation in open loop (orientations of imposed single-stripe visual scene 
are shownat the top). b, A bump offset of close to zero before optogenetic 
manipulation (arrowineshows the time of this snapshot). c, Optogenetic 
imposition of the new offset. Left, when the vertical stripe is in front of the fly, 
the bump was imposed on the right side of the ellipsoid body (rectangle). Right, 
45° rotated scene and bump withthe same offset as shown on the left. This 
offset was sequentially imposed across eight positions of the visual scene and 
ellipsoid body for approximately 2s per position for 5 min (e middle). d, 
Snapshot of compass-neuron calcium transients after manipulation 

(e bottom). The bump position relative to same visual scene as in bis now 
shifted by the offset imposed inc. e, Segments (60s) of imaging before (top), 
during (middle) and after (bottom) a5-min optogenetic manipulation. 
Conventions are the sameas in Fig. 1. f, Bootstrapped distribution of the mean 
difference between the imposed and actual offset shifts in Fig. 2 (natural 
scene), which was not significantly different from 0 (19 trials from 10 flies, 
bootstrapped mean difference test, two-sided, P= 0.6276). g, Bootstrapped 
distribution of the mean difference between the imposed and actual offset 
shifts in b—d (single stripe), which was not significantly different from 0 

(25 trials from 14 flies, two-sided, P= 0.8932). h-k, Distribution of imposed 


(xaxis) versus actual (y axis) offset shifts across flies. The distribution is 
significantly linear along the identity line (circular linearity test. h, Natural 
scene, 19 trials from 10 flies, P< 0.0001. i, Single stripe, 25 trials from 14 flies, 
P<0.0001.j, No CsChrimson, 14 trials from 10 flies, P=0.0934. k, In darkness, 
17 trials from 10 flies, P= 0.6064). I-o, Absolute change in offset across two 
trials before manipulation (blue) and across two trials after manipulation 
(yellow), and absolute change in offset induced by manipulation (red). 
Bootstrapped mean difference tests, one-sided. n values are the sameasinh-k. 
I, Natural scene, bootstrapped mean difference test between epochs before 
and during manipulation, P= 0.0464; and between epochs during and after 
manipulation, P= 0.0024. m, Single stripe, bootstrap tests of the mean 
difference showed a significant difference between the baseline offset shifts 
and manipulated offset shifts (P=0.0207 between epochs before and during 
manipulation; and P= 0.0252 between epochs during and after manipulation). 
n, NoCsChrimson control, bootstrap tests of the mean difference did not show 
any significant difference; P> 0.05 for all pairs. o, Darkness control, bootstrap 
tests of the mean difference did not show any significant difference; P> 0.05 for 
all pairs. Baseline offset shifts were comparable to the experimental group (m), 
but greater than the control group without CsChrimson (n). This suggests that 
the baseline offset variance in the experimental group might be due to a higher
baseline activity of the compass-neuron population, induced by weak 
activation of CsChrimson during two-photon imaging. 
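The bootstrapped mean difference tests quoted in this caption can be illustrated with a minimal sketch. It assumes the input is a list of per-trial differences (in degrees, already wrapped to a range in which ordinary averaging is meaningful) and uses a simple centred-resampling scheme; the function name and the resampling details are illustrative assumptions, not the authors' analysis code.

```python
import random
import statistics


def bootstrap_mean_test(diffs, n_boot=10000, seed=0):
    """Two-sided bootstrap test of whether the mean of `diffs` differs from 0.

    Hypothetical helper: resamples the centred data so that the null
    hypothesis (mean = 0) holds, then counts how often the resampled mean
    is at least as extreme as the observed mean.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(diffs)
    centred = [d - observed for d in diffs]  # enforce the null hypothesis
    n = len(diffs)
    extreme = 0
    for _ in range(n_boot):
        resample = [centred[rng.randrange(n)] for _ in range(n)]
        if abs(statistics.fmean(resample)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated P value strictly positive.
    return observed, (extreme + 1) / (n_boot + 1)


# Toy differences between imposed and actual offset shifts (made-up values).
mean_diff, p_value = bootstrap_mean_test([10, -12, 5, -3, 8, -9, 2, -1])
```

A one-sided variant would compare the signed resampled mean with the signed observed mean instead of their absolute values.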
Extended Data Fig. 2 | Simulation showing the mapping of a complex scene
onto a stable heading representation and optogenetic bump offset shifting.
a, A complex one-dimensional scene was generated via a mixture of four von
Mises functions with random mean directions and random concentration
parameters, shown for t = 0. b, c, Model simulation. Ring-neuron population
activity (b, top) serves as the assumed source of visual input. A time series of 
angular velocity obtained from tethered flight data was used to compute 
movement of the visual scene. b, Bottom, compass-neuron population activity 
during simulated orientation. c, Time-varying synaptic weights between ring 
and compass neurons. The simulation began with random synaptic weights 
(left) and random initial activity of compass-neuron population. Ring attractor 
dynamics ensures a stable bump, albeit with a random offset. The initial
turning of the bump is not enforced by visual cues but by the angular velocity signal
from tethered flight data. The same 400-s turning signal was repeated 3 times 
(Supplementary Information). Synaptic weights stabilize over time (c, right). 
After learning, a vertical cross-section of the stabilized synaptic weight matrix 
resembles the model ring-neuron activity profile shown in a. d, Simulation of
optogenetic shift in offset. The simulation began with the stable mapping
shown in c. e, During the probe trial, the newly mapped offset was
consolidated. All simulation results shown are based on a post-synaptically
gated plasticity rule, unless otherwise stated. Extended Data Figures 5, 6 and 
Supplementary Information provide the differences in predictions made by 
post- and pre-synaptically gated plasticity rules. 
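The scene construction described in a (a mixture of four von Mises functions with random mean directions and concentrations) can be sketched as follows. The number of azimuthal samples (32), the range of concentration parameters and the function names are assumptions made for illustration; the caption does not specify them.

```python
import math
import random


def von_mises(theta, mu, kappa):
    """Unnormalised von Mises bump with peak value 1 at theta == mu."""
    return math.exp(kappa * (math.cos(theta - mu) - 1.0))


def make_scene(n_pixels=32, n_components=4, seed=0):
    """Sum of von Mises bumps sampled at n_pixels evenly spaced azimuths."""
    rng = random.Random(seed)
    # Assumed ranges: any mean direction; concentration between 1 and 10.
    params = [(rng.uniform(0.0, 2.0 * math.pi), rng.uniform(1.0, 10.0))
              for _ in range(n_components)]
    angles = [2.0 * math.pi * i / n_pixels for i in range(n_pixels)]
    scene = [sum(von_mises(a, mu, k) for mu, k in params) for a in angles]
    return angles, scene


def rotate(scene, shift):
    """Circularly shift the scene by an integer number of pixels."""
    shift %= len(scene)
    return scene[-shift:] + scene[:-shift]


angles, scene = make_scene()
turned = rotate(scene, 5)  # the scene as seen after a turn of 5 pixels
```

In a simulation of the kind described, `rotate` would be driven by the integrated angular velocity signal to move the scene across the ring-neuron array.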
Extended Data Fig. 3 | Bump dynamics after a closed-loop two-stripe
manipulation. a–c, Simulation of the time evolution of the synaptic weight
matrix, induced by a visual scene with two vertical stripes. Conventions are the
same as in Extended Data Fig. 2. a, The simulation began with the stabilized
synaptic weight matrix shown in Fig. 2e. Visual input provided was two narrow 
von Mises functions, separated by 180°. Ring attractor dynamics ensured that 
the compass-neuron population maintained a single bump. Over time, the 
synaptic weight matrix develops two distinct bands of weak synapses (right), 
representing weakened connections from two active sets of ring neurons to a
compass-neuron bump. b, c, When the system is then presented with a visual
scene that has only one vertical stripe, there are two possible outcomes: ring 
attractor dynamics stabilizes an offset that is either shifted 180° from the 
original offset (b) or the same as the original offset (c). d–i, Natural bump-
offset shifting with two identical vertical stripes (no optogenetic
manipulation) separated by 165° in a 330° arena. d–f, Segments (60 s) of
compass-neuron calcium transients before (d), during (e) and after (f) 
manipulation. Conventions are the same as in Fig. 2d, except that the red line 
represents the position of either one (d, f) or two (e) stripes. Imaging snapshots 
shown in the left panels were taken at times indicated with arrows beneath 
right panels. The bump offset is shifted by 180° in f, relative to its position in e
(Supplementary Video 4). g, Distribution of the absolute shift in offset
measured across trials from all flies. Left, baseline variance; change in offset
across two trials before manipulation. Right, baseline variance; change in
offset across two trials after manipulation. Centre, change in offset across two 
trials separated by a manipulation trial. In three cases (n=19), the shift in offset 
was close to 180°. Unlike in simulations, in most two-stripe trials the bump 
position covers only half of the ellipsoid body because of the circular symmetry 
of the stimulus, which may underlie the apparently low yield of shifting (but see 
h and i; see Supplementary Information for further discussion). h, The number
of bumps during the initial 15 s of 16 trials that did not exhibit a shift of 180° was 
significantly greater in trials that immediately followed a manipulation trial 
(red) than in a subsequent trial (blue) (bootstrap test of the mean difference,
one-sided, P= 0.0004), indicating that initial competition between two bumps 
eventually stabilizes to a single bump. This implies that the manipulation trial 
generated two competing offsets. i, The deviation of the bump offset during 
the initial 15 s relative to the average bump offset during the final 30 s of the same
trial was also significantly greater in the trial immediately following a
manipulation trial than in a subsequent trial (bootstrap test of mean
difference, one-sided, P = 0.0036), which is a natural consequence of
competition between two alternating bumps before one stabilizes. 


[Extended Data Fig. 4 image: simulation and imaging panels labelled BEFORE, OPTO-STIM and PROBE; axes show ring neurons (1–32) and time (s). Caption follows.]
Extended Data Fig. 4 | Global offset shift by local optogenetic manipulation.
The conventions are the same as in Extended Data Fig. 2. a–e, Local optogenetic
manipulation spanning 180°. a, The simulation begins with a stabilized
synaptic weight matrix, shown in Fig. 2e. Over time, a new map spanning 180°
replaced approximately half of the original map (right). A portion of the 
synaptic weight matrix, corresponding to visual orientations that were not 
presented, was erased over time (top right corner of right panel). b,c, After 
manipulation, two potential maps (the original map and the newly imposed 
map) compete. Which map eventually stabilizes and strengthens
depends on whether or not the bump and stimulus begin in the newly mapped
region of the ellipsoid body in the trial that immediately follows manipulation. 
d, Compass-neuron calcium transients before (top), during (middle) and after 
(bottom) optogenetic manipulation spanning 180° of the visual scene and the 
ellipsoid body. The conventions are the same as in Fig. 2d. Compare the offsets 
in the top and bottom panels. e, Distribution of the absolute shift in offset,
measured across flies. White dots, baseline before manipulation; black dots, 
offset shift by manipulation (10 flies, bootstrapped mean difference test, one- 
sided, *P< 0.0001). f-k, Local optogenetic manipulation spanning 60°. f, The 
simulation begins with the stabilized synaptic weight matrix shown in Fig. 2e. 
Over time, the newly imposed map replaces a portion of original map, which 
spans more than 60° because of the non-zero width (118° tail to tail) of the 
bump (bottom right). g, h, After the manipulation, two potential maps (the 
original map and the newly imposed map) compete. After the epoch of 
manipulation, if the bump begins in the manipulated region (g), the new map is
likely to dominate and eventually strengthen. i–k, Optogenetic manipulation
spanning 60° of the visual scene and the ellipsoid body. i, Segments (60 s) of
compass-neuron population activity before (top), during (middle) and after
(bottom) manipulation. The position of the stripe (bottom) is not in the
manipulated domain, yet the bump is shifted to the optogenetically imposed
offset (compare the offsets in the top and bottom panels). j, Left, data from
60°-span manipulation, after which a closed-loop probe trial begins with the
stripe in the position that was sampled during manipulation. Open dots, 
baseline variance of the offset around mean, before manipulation. Solid blue 
dots, shift in offset induced by 60°-span manipulation. Across the population, 
the shift was significant (bootstrapped mean comparison, one-sided, 
P<0.0013). Right, data from 60°-span manipulation, after which closed-loop 
probe trial begins with the stripe outside the set of positions sampled during 
manipulation. Open dots, baseline variance. Solid red dots, shift in offset 
induced by manipulation. The shift was only marginally significant across the 
population (bootstrapped mean comparison, one-sided, P= 0.012). The global 
extrapolation of local manipulation was facilitated when the stripe began in 
manipulated positions in the probe trial (binomial exact test, *P = 0.0059)
(Methods). k, Same data as in j but re-categorized. Left, in probe trials, both the
bump and stripe began in a position sampled during the manipulation (4 out of
20 flies). All 4 flies showed a greater-than-90° shift during probe trials. Right, 
all other conditions (16 out of 20 flies). In total, 3 out of 16 flies showed a 
greater-than-90° shift. The facilitation of global extrapolation when both the 
bump and stripe began in manipulated positions was significant (binomial 
exact test, *P= 0.0012) (Methods). 
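As a worked illustration of a binomial exact test on the counts reported in k, the sketch below asks how likely it is that all 4 of 4 flies show a greater-than-90° shift if each fly shifted with the rate seen in the comparison group (3 of 16). Taking the null probability from the comparison group is an assumption made here for illustration; the exact procedure is described in the Methods, which are not reproduced in this section.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Assumption: null probability taken from the comparison group (3 of 16 flies).
p_null = 3 / 16
p_value = binomial_upper_tail(4, 4, p_null)
print(round(p_value, 4))  # 0.0012
```

Under this assumption the tail probability is (3/16)^4, which agrees with the value quoted in the caption to two significant figures.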
Extended Data Fig. 5 | Deterministic offset difference between two artificial
scenes with the same local feature but different two-dimensional 
organization. The Supplementary Information provides a detailed discussion. 
a, Compass-neuron calcium transients measured during closed-loop tethered 
flight in an artificial scene, arrangement A (A). The conventions are the same as 
in Fig. 1h. b, Calcium transients from the same fly as in a, but with a different
artificial scene, arrangement B (B). c, Distribution of the mean offset of each
trial, pooled across all flies (Methods). Distributions of offsets relative to
scenes A and B were not significantly different from uniform (n = 40 trials from
10 flies, unimodality test by randomization, P = 0.0819 for A, P = 0.1525 for B).
Compare with Fig. 1j. d, Distribution of offset shifts between two trials. The
distribution of offset shifts between two artificial scenes, measured across 
flies, was significantly different from uniform distribution (unimodality test by 
randomization, from A to B, n = 10 flies, P < 0.0001; from B to A, n = 10 flies,
P < 0.0001). The shift in offset was similar across different encounters with the
same scene, indicating that the offset was stable (unimodality test by
randomization, from A to A, n = 10 flies, P = 0.0001; from B to B, n = 10 flies,
P = 0.0004). Compare with Extended Data Fig. 6e. e, Parameter sweep to
explore how two-dimensional Gaussian filters of different s.d., applied to the
artificial scenes in a (arrangement A) and b (arrangement B), would affect shifts
in offset between the two scenes. Filters represent the simplified effect of ring- 
neuron filtering of scenes. Shifts in offset should approximately match 
azimuthal shifts that would produce the best match (that is, maximum two- 
dimensional cross-correlation) between the filtered scenes. Each axis 
represents increasing s.d. of the applied two-dimensional Gaussian filter (σ).
The point marked with a red X is shown in f. f, Two-dimensional cross-
correlation between two scenes in a and b after applying two-dimensional
Gaussian filtering with 15° s.d. (red X in e). This filter size corresponds to a 30°
full-width at half-maximum receptive field, which matches the average size of
the minor axis of ellipses that fit ring-neuron receptive fields. Higher filter
sizes up to 60° full-width at half-maximum (the average size of the major axis of
elliptical fits of ring-neuron receptive fields) require similar azimuthal
shifts to obtain a best match between the scenes (not shown in e). The
azimuthal shift for the best match for this range of filters is 165°, a half rotation
of the scene on the visual arena (as observed in d). g, Scenes in a and b after
applying Gaussian filtering with 15° s.d. h, i, Simulation of pre- and post-
synaptically gated plasticity rules applied when the model network is exposed
to the two different filtered scenes shown in g. h, Evolution of the synaptic
weight matrix with a pre-synaptically gated plasticity rule. Top left, initial 
random synaptic weight matrix from 8 x 32 ring neurons to 1 of 32 compass 
neurons. Top right, after exposure to scene A. Each compass neuron responds 
most to a snapshot of the scene at a particular orientation. Second row, after
exposure to scene B, a new snapshot is mapped to the compass-neuron heading
representation. The locations of the top two horizontal bars in arrangements A
and B overlap (red rectangles), which corresponds to a 165° shift in the two-
dimensional cross-correlation in e and f (or a 180° shift in the 360° arena in
simulations). This deterministic offset shift results in the same pinning offset
and a retrieval of the same heading representation as before when the scene is
repeated later (bottom two rows). The third and fourth rows show repeated 
exposure to scenes A and B. Bottom two rows, retrieval of the original offset. 

i, Evolution of the synaptic weight matrix with post-synaptically gated
plasticity rule. The result is almost identical to h, given that all ring neurons and
compass neurons are activated during simulation. j, k, Simulated offset shifts
with pre-synaptically (j) and post-synaptically (k) gated plasticity rules. For
each rule, 100 simulations were performed. Both the pre-synaptic and the post-
synaptic rules reproduced the population data in d.
Extended Data Fig. 6 | Memory capacity of different plasticity rules.
a–d, Simulation of pre- and post-synaptically gated plasticity rules with simple
two-dimensional scenes. a, Initial random synaptic weight matrix from
2 × 32 ring neurons to 1 of 32 compass neurons. b, Two simple simulated scenes
activate mutually exclusive ring neurons. T, top ring neurons are active; B, 
bottom ring neurons are active. c, Evolution of synaptic weights for a pre- 
synaptically gated plasticity rule. Top left, initial random weight matrix before 
presenting scene T. Top right, after exposure to scene T, only synapses from 
active ring neurons (top row of ring neurons in e) were updated, while synapses 
from all other ring neurons (bottom row of ring neurons in e) remained intact. 
Second row, after exposure to scene B, ring neurons that were previously 
inactive became activated, and their synapses were updated. Third row, when 
scene T was presented again, the offset between scene orientation and bump 
position was the same as when scene T was first presented (f). d, Evolution of
synaptic weights for a post-synaptically gated plasticity rule. Synapses from
inactive ring neurons are erased upon each encounter with a new scene. This
would shift offset across two encounters of the same scene if the fly
experiences a different scene between them. e, Population data are from ten
flies. Distribution of offset shifts between two trials in Fig. 1h, i. The 
distribution of offset shifts between two different natural scenes, measured 
across flies, is not significantly different from uniform distribution 
(unimodality test by randomization, from F to O, P = 0.489; from O to F,
P = 0.1504). Different encounters of the same scene lead to similar, near-zero
offset shifts, indicating stability of offset (unimodality test by randomization,
from F to F, P = 0.0035; from O to O, P < 0.0001). f, g, Simulated offset shifts with
pre-synaptically (f) and post-synaptically (g) gated plasticity rules. For each 
rule, 100 simulations were performed. 
Life sciences study design


All studies must disclose on these points even when the disclosure is negative. 


Sample size The number of flies to collect was chosen to be 10, which is sufficient to perform non-parametric tests, for all experimental and
control groups, unless mentioned otherwise in Methods.


Data exclusions No exclusion of flies. In each fly, segments when fly was not flying were excluded from analyses. 


Replication The optogenetic bump offset shifting has been replicated ten times with different conditions, all of which reproduced significant effects. Five 
of them are reported in the manuscript. 


Randomization All flies in this study were randomly selected from their housing vials.


Blinding Not applicable. 
Materials & experimental systems Methods

n/a | Involved in the study n/a | Involved in the study 
Antibodies ChIP-seq 
Eukaryotic cell lines Flow cytometry 
Palaeontology MRI-based neuroimaging 
Human research participants


Clinical data 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals 5- to 8-day old female Drosophila melanogaster 

Wild animals The study did not involve wild animals. 

Field-collected samples The study did not involve samples collected from the field. 

Ethics oversight No ethical approval or guidance was required for Drosophila physiological experiments. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 
Multiplexed RNA sequencing in individual cells is transforming basic and clinical life
sciences. Often, however, tissues must first be dissociated, and crucial information
about spatial relationships and communication between cells is thus lost. Existing
approaches to reconstruct tissues assign spatial positions to each cell, independently
of other cells, by using spatial patterns of expression of marker genes, which often
do not exist. Here we reconstruct spatial positions with little or no prior knowledge, 
by searching for spatial arrangements of sequenced cells in which nearby cells have 
transcriptional profiles that are often (but not always) more similar than cells that are 
farther apart. We formulate this task as a generalized optimal-transport problem for 
probabilistic embedding and derive an efficient iterative algorithm to solve it. We 
reconstruct the spatial expression of genes in mammalian liver and intestinal 
epithelium, fly and zebrafish embryos, sections from the mammalian cerebellum and 
whole kidney, and use the reconstructed tissues to identify genes that are spatially 
informative. Thus, we identify an organization principle for the spatial expression of 
genes in animal tissues, which can be exploited to infer meaningful probabilities of 
spatial position for individual cells. Our framework (‘novoSpaRc’) can incorporate 
prior spatial information and is compatible with any single-cell technology. 
Additional principles that underlie the cartography of gene expression can be tested
using our approach.


Single-cell RNA sequencing (scRNA-seq) has revolutionized our under- 
standing of the rich heterogeneous cellular populations that make up 
tissues, the dynamics of developmental processes and the underlying 
regulatory mechanisms that control cellular function. However, to
understand how single cells orchestrate multicellular functions, it 
is crucial to have access not only to the identities of single cells but 
also to their spatial context. This is a challenging task, as tissues must 
commonly be dissociated into single cells before scRNA-seq can be
performed, and thus the original spatial context and relationships
between cells are lost. Two seminal papers tackled this problem
computationally, the key idea being to use a reference atlas of informative
marker genes as a guide to assign spatial coordinates to sequenced cells.
This concept was successfully used in various tissues, including the
early Drosophila embryo. However, such methodologies rely heavily
on the existence of an extensive reference database for spatial expression
patterns, which may not always be available or straightforward
to construct. Moreover, in practice the number of available reference 
marker genes is usually not large enough to label each spatial position 
with a distinct combination of reference genes, making it impossible 
to uniquely resolve cellular positions. More generally, marker genes, 
even when available, convey limited information, which could possibly 
be enriched by the structure of single-cell data. 

To this aim, we developed a new computational framework (novoSpaRc),
which allows for de novo spatial reconstruction of single-cell
gene expression, with no inherent reliance on any prior information,
and the flexibility to introduce it when it does exist (Fig. 1). Similar to
solving a puzzle, we seek the optimal configuration of pieces (cells)
that recreates the original image (tissue). However, contrary to a typical
puzzle, here we do not have access to the image that we aim to reconstruct.
Although the number of ways to spatially arrange (or ‘map’)
sequenced cells in tissue space is enormous, our hypothesis is that 
gene expression in the vast majority of these arrangements will not be 
as organized as in the real tissue. For example, we know that typically 
there exist genes that are specifically expressed in spatially contiguous 
territories and are thus consistent with only a small subset of all pos- 
sible arrangements. We therefore set out to identify simple, testable 
assumptions that govern how gene expression is organized in space, 
and to subsequently find the arrangements of cells that best respect 
those assumptions. 


novoSpaRc charts gene expression in tissues


Here, we specifically explore the assumption that cells that are physi- 
cally close tend to share similar transcription profiles, and vice versa 
(Extended Data Fig. 1a, Supplementary Methods). Biologically, this 
phenotype can result from multiple mechanisms, such as gradients of 
oxygen, morphogens and nutrients, the trajectory of cell development 
and communication between neighbouring cells. We stress that this is an 
assumption about overall gene expression across the entire tissue—not 
about individual genes and not about all cells that are physically close 
(Supplementary Methods). We show that, on average, the distance 
between cells in expression space increases with their physical distance, 
for diverse tissues in mature organisms or whole embryos in early devel- 
opment. Thus, to predict the spatial locations of sequenced cells, we 
Fig. 1 | Overview of novoSpaRc. A matrix that contains single-cell transcriptome profiles, sequenced from dissociated cells, is the main input for novoSpaRc.
The output is a virtual tissue of a chosen shape, which can be queried for the expression of all genes quantified in the data.


seek to find a map of sequenced cells to tissue space (‘cartography’) 
such that overall structural correspondence is preserved—meaning that, 
overall, cells have similar relative distances to other cells in expression 
and physical space. The physical space is anchored by locations that may 
be either known (such as the reproducible cellular locations in the early
stages of development of the Drosophila embryo) or approximated
by a grid (Supplementary Methods). The distances are first computed
for each pair of cells across graphs constructed over the two spaces, 
to account for the underlying structure of the data (Supplementary 
Methods). Then, novoSpaRc optimally aligns the distances of pairs of 
cells between the expression data and geometric features of the physi- 
cal space, in a way that is consistent with spatial expression profiles 
of marker genes when these are available (Methods, Supplementary 
Methods). For reasons that are both biologically and computationally 
motivated, we seek a probabilistic mapping that assigns each cell a distribution
over locations on the physical space (Supplementary Methods).
We formulate this as a generalized optimal-transport problem, which
has proven to be increasingly valuable for diverse fields (including
biology) and renders the task of reconstruction feasible for large datasets.
Specifically, we formulate an interpolation between entropically
regularized Gromov-Wasserstein and optimal-transport objectives,
which serves to satisfy the assumption of structural correspondence 
between gene expression space and physical space, and to match prior 
knowledge when available (Methods). We show that this optimization 
problem can be efficiently solved using projected gradient descent 
reduced to iterations of linear optimal-transport sub-problems (Sup- 
plementary Methods). To systematically assess the performance of 
novoSpaRc, we used a simple generative model of spatial gene expres- 
sion to show that it can robustly recover it (Supplementary Methods, 
Extended Data Fig. 1b-d). 


novoSpaRc reconstructs tissues de novo 


Focusing on real single-cell datasets, we first reconstructed tissues 
de novo that have inherent symmetries that render them effectively 
one-dimensional, such as the mammalian intestinal epithelium and
liver lobules. Schematic figures of the reconstruction process are
shown in Fig. 2a, e. Cells were previously classified into seven distinct
zones for the intestine, or nine layers for the liver, on the basis of robust
marker gene information. We found that the average pairwise distances
between cells in expression space increased monotonically with
the pairwise distances in physical one-dimensional space (Fig. 2b, f), 
consistent with our structural correspondence assumption. 
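The check of the structural correspondence assumption can be sketched as follows: for every pair of cells, compute the distance in physical space and in expression space, then average the expression distances within bins of physical distance. This sketch uses plain Euclidean distances for simplicity, whereas the text describes distances computed along graphs built over each space; the function name and the toy data are assumptions.

```python
import math
import statistics


def mean_expression_distance_by_bin(positions, expression, bin_width):
    """Average expression-space distance of cell pairs, binned by physical distance.

    Returns a list of (bin lower edge, mean expression distance) pairs,
    sorted by physical distance.
    """
    bins = {}
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            d_phys = math.dist(positions[i], positions[j])
            d_expr = math.dist(expression[i], expression[j])
            bins.setdefault(int(d_phys // bin_width), []).append(d_expr)
    return [(b * bin_width, statistics.fmean(v)) for b, v in sorted(bins.items())]


# Toy one-dimensional tissue: 10 cells on a line, two genes expressed in gradients.
positions = [(float(x),) for x in range(10)]
expression = [(float(x), 2.0 * x) for x in range(10)]
profile = mean_expression_distance_by_bin(positions, expression, bin_width=1.0)
```

For data that respect the assumption, the second element of each pair in `profile` increases with the first, as it does for this toy gradient.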

We used novoSpaRc to embed the expression data into one dimen- 
sion. The embedded coordinates of single cells correlated well on 
average with their layer or zone memberships (Fig. 2c, g, Supple- 
mentary Methods). The median Pearson correlation coefficient for 
reconstructed expression patterns to original patterns for the top 100 
variable genes was 0.99 for intestine and 0.94 for liver (Supplementary 
Methods), and the fraction of cells that were correctly assigned up to 
one layer away from their original layer was 0.98 for intestine and 0.73
for liver (Supplementary Methods, Extended Data Fig. 2a, b). novoSpaRc
captured spatial expression patterns of the top zonated genes
and spatial division of labour within the intestinal epithelium—as well 
as within the layers of the liver lobules (Methods, Fig. 2d, h, Extended 
Data Fig. 3a, b), in which cells in different tissue layers perform different 
tasks and exhibit different expression profiles. For the intestine, varying 
the grid resolution to include either fewer or more embedded zones 
did not compromise the quality of the reconstructed expression pat- 
terns (Extended Data Fig. 3c), which shows the potential for increased 
resolution of single-cell-based relative to atlas-based embedding. 
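The two summary statistics quoted above (the median Pearson correlation between reconstructed and original expression patterns, and the fraction of cells placed within one layer of their original layer) can be computed along the following lines. The data layout (a dictionary from gene name to per-zone expression, and integer layer labels per cell) and the function names are assumptions for illustration, and the toy values are made up.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def median_gene_correlation(original, reconstructed):
    """Median over genes of the correlation between the two spatial patterns."""
    return statistics.median(
        pearson(original[g], reconstructed[g]) for g in original)


def fraction_within_one_layer(true_layers, inferred_layers):
    """Fraction of cells assigned at most one layer away from their true layer."""
    hits = sum(abs(t - i) <= 1 for t, i in zip(true_layers, inferred_layers))
    return hits / len(true_layers)


# Toy data: expression per zone for three hypothetical genes.
original = {"g1": [1, 2, 3, 4], "g2": [4, 3, 2, 1], "g3": [1, 3, 2, 4]}
reconstructed = {"g1": [2, 4, 6, 8], "g2": [1, 2, 3, 4], "g3": [1, 3, 2, 4]}
median_r = median_gene_correlation(original, reconstructed)
accuracy = fraction_within_one_layer([0, 1, 2, 3], [0, 2, 2, 0])
```

Using the median rather than the mean keeps the summary robust to a few poorly reconstructed genes, such as `g2` in this toy example.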


novoSpaRc reconstructs early embryos 


Next, we focused on spatially reconstructing the well-studied Drosophila
embryo, as a more-challenging, higher-dimensional tissue. Late
in stage 5 of development, the fly embryo consists of around 6,000 cells.
It has been previously suggested that at early stages of fly development,
the expression levels of gap genes can be optimally decoded
into positional information. The expression levels of 84 transcription
factors were quantitatively registered using fluorescence in situ
hybridization (FISH) for each of the cells by the Berkeley Drosophila
Transcription Network Project (BDTNP).

To assess the performance of novoSpaRc, we first simulated scRNA-
seq data by dissociating the BDTNP dataset into single cells in silico
(Methods), and then attempted to reconstruct the original expression 
patterns across the tissue both de novo and by using marker genes 
(Fig. 3a). Similarly to the ‘one-dimensional’ datasets, we found a mono- 
tonically increasing relationship between the cell-cell pairwise dis- 
tances in expression space and in physical space (Fig. 3b), confirming 
that the data adheres to our structural correspondence assumption. 

The reconstructed patterns of spatial gene expression highly cor- 
related with the original ones (Fig. 3c). We found that the novoSpaRc 
reconstruction that incorporated both structural and marker gene 
information outperformed the reconstruction based on only the lat- 
ter, and that performance was saturated at two marker genes (Fig. 3c), 
independently of the marker genes used. As expected, the quality of 
the reconstruction increased with the number of genes used to provide 
structural information in expression space, and with the fraction of 
spatially informative genes (Supplementary Methods, Extended Data 
Fig. 4a, b). The majority of spatial patterns were recapitulated faithfully 
even when only a single marker gene was used (Fig. 3c, d). In addition, 
novoSpaRc identified the physical neighbourhoods from which cells
originated when used de novo (up to inherent symmetries; see Sup- 
plementary Methods), and pinpointed their true locations (P< 0.05 
compared to random assignment) when a handful of marker genes 
were used (Fig. 3e, Extended Data Fig. 5a, b). 

We examined the expression patterns of four transcription factors that 
span the dorsal-ventral and anterior—posterior axes (Fig. 3d). The quality 
of the reconstruction improved when applying the structural correspondence
assumption (Supplementary Methods, Extended Data Fig. 5d). The
de novo reconstruction correctly identified both axes of the embryo, and
Fig. 2 | novoSpaRc successfully reconstructs complex tissues with effective
one-dimensional structure de novo. a, e, The reconstruction scheme for the 
mammalian intestinal epithelium (a) and liver lobules (e). b, f, Demonstration 
of the monotonic relationship between cellular pairwise distances in 
expression and physical space for intestinal epithelium (b) and liver lobules (f). 
Distances are measured as weighted shortest paths along the graphs 
constructed over physical or expression spaces. Data are mean ± s.d.
c, g, novoSpaRc infers the original spatial context of single cells of the intestinal


the reconstructed portrait was remarkably similar to the original one 
(Fig. 3d). In general—because de novo reconstruction is performed with- 
out any prior information that would anchor the cells—the reconstructed 
configuration is similar up to global transformations (reflections, rotations
and translations), relative to the respective axes of symmetry (Supplementary
Methods). Consequently, the resulting patterns of gene expression
might be shifted or flipped relative to the expected ones. However, there
are features of a faithful reconstruction that we can test for, such that the
reconstruction would be robust to small changes in the optimization 
parameters (Supplementary Methods, Extended Data Fig. 4i) and that 
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Fig.3|novoSpaRc accurately reconstructs the Drosophila embryo onthe 
basis of the BDTNP dataset”. a, FISH data are used to create virtual scRNA- 

seq data, which novoSpakRc inputs to reconstruct a virtual embryo. 

b, Demonstration of the structural correspondence hypothesis. Pairwise 
cellular distances in expression space increase monotonically with distances in
physical space. Data are mean ± s.d. c, novoSpaRc spatially reconstructs the
Drosophila embryo with only one marker gene. The quality of reconstruction 
(measured by Pearson correlation with FISH data) increases with the number of 
marker genes and saturates at perfect reconstruction at two marker genes, 
when using both structural information and marker gene information (blue 
boxes). This outperforms reconstruction that relies only on marker gene

epithelium (c) and liver lobules (g) with high accuracy. Heat maps show the
inferred distribution over embedded layers (rows) for the cells in each of the
original layers (columns). d, h, novoSpaRc captures the spatial division of
labour of averaged expression of genes that have a role in the absorption of
different classes of nutrients in the intestine (d) and the spatial expression
patterns of a group of pericentral, periportal and non-monotonic genes in the
liver lobule (h). The expression level of each gene in d and h is normalized to its
maximum value. 


the embedding of cells onto the embryo would be relatively localized—as 
we would expect for a biologically meaningful embedding (Fig. 3e). This 
means that the distribution over locations that each cell is assigned should
be localized, and indeed, the mean standard deviation of that distribution 
for all cells is significantly lower than that of a randomized embedding 
(Supplementary Methods, Extended Data Fig. 4j). Furthermore, we dem- 
onstrated that the results from novoSpaRc—as measured by correlation 
to observed imaging data and optimization error—were robust to opti- 
mization parameters and sources of noise, including partial sampling of 
cells, additive expression noise and dropouts (Extended Data Fig. 4c–h).

information (yellow boxes). The results are averaged for 100 different
combinations of marker genes. For the box plots, the centre line is the median,
box limits are the 0.25 and 0.75 quantiles and whiskers extend to ±2.698 s.d.

d, Visualization of the reconstruction results for four transcription factors. The 
original FISH data (first row) are compared to reconstruction by novoSpaRc 
that exploits both structural and marker gene information (using two marker 
genes and one marker gene) and reconstruction without any marker gene 
information (de novo). e, The original locations of three cells are compared to 
their respective reconstructed locations by novoSpaRc (using two marker 
genes and one marker gene). The expression patterns of the marker genes used 
for the results in d and e are shown in Extended Data Fig. 5c.

Fig. 4 | novoSpaRc identifies spatial archetypes in the Drosophila embryo by
using scRNA-seq data. a, Schematic overview. The expression patterns as 
reconstructed by novoSpaRc are compared with the BDTNP expression values. 
b, Reconstruction of the Drosophila embryo using scRNA-seq data. 
Distributions of gene-specific Pearson correlation coefficients reflect better 


We next used novoSpaRc to reconstruct the stage 6 Drosophila 
embryo by using a scRNA-seq dataset” (Fig. 4a). In that study, 84 marker
genes were required to reconstruct a virtual embryo by distributing 
1,297 cells over 3,039 locations. When we used novoSpaRc with the 
combination of both structural information and the reference atlas, the 
accuracy of reconstruction increased with the number of marker genes, 
reaching high correlation (Pearson correlation coefficient, 0.74) with 
the FISH data (Fig. 4b, Extended Data Fig. 5e). The de novo, atlas-free 
reconstruction accurately separated the major post-gastrulation spatial 
domains (mesoderm, neurogenic ectoderm and dorsal ectoderm), 
as well as finer spatial domains (Fig. 4c, d). We clustered the reconstructed
patterns of the highly variable genes and averaged to obtain a
representative pattern for each cluster, which we term the ‘archetype’ 
(Methods, Supplementary Information). novoSpaRc identified numer- 
ous distinct spatial archetypes (Fig. 4c, d, Extended Data Fig. 6). We 
compared representative genes of each spatial archetype with FISH 
images to visually assess the accuracy of the spatial reconstruction. 
Gene patterns that were expressed through the anterior–posterior
or the dorsal-ventral axis were largely recapitulated: typical genes of 
the mesoderm (dorsal ectoderm), such as twi and sna (zen and ush), 
were colocalized ventrally (dorsally) (Fig. 4c, d, right, middle). novo- 
SpaRc accurately captured localized spatial populations (Fig. 4c, d, 
left, Extended Data Fig. 6, archetype 5), whereas less-extensive spa- 
tial domains were reconstructed with varying degrees of accuracy 
(Extended Data Fig. 6). Note that within the de novo reconstruction, 
accurate localization entails global transformations, as described above 
(Supplementary Methods). 

Before proceeding to more complex tissues, we reconstructed the 
zebrafish embryo dataset° (Extended Data Fig. 7). Similar to the original 
seminal study, we mapped the cells onto the surface of a hemisphere 
consisting of 64 distinct locations. The resulting spatial expression pat- 
terns highly correlated to the experimentally verified ones; novoSpaRc 
reconstructed the zebrafish embryo by using only 15 marker genes 
(in contrast to the 47 genes that were previously required®) and the 
accuracy of the reconstruction increased with the number of marker 
genes (Extended Data Fig. 7, Methods). Furthermore—in contrast to 
previous reconstructions—no data imputation or other specialized 
preprocessing was necessary’.

reconstruction with increasing number of marker genes. c, Three of the spatial
archetypes (1, 3 and 9) that novoSpaRc identified in the Drosophila embryo.
d, Representative genes for each of the spatial archetypes depicted in c. FISH
data (left columns) are compared against the corresponding novoSpaRc
predictions (‘virtual in situ hybridization’ (vISH); right columns).

novoSpaRc charts diverse complex tissues


To further demonstrate the applicability of novoSpaRc to 
complex tissues, diverse sequencing technologies and different organ- 
isms, we used it to reconstruct slices of mammalian brain cerebellum” 
(Fig. 5), the mammalian kidney” (Extended Data Fig. 8) and a dataset 
of hundreds of individual Drosophila embryos” (Extended Data Fig. 9). 

The adult mammalian brain is a well-studied, highly differentiated 
and complex tissue. To benchmark the capabilities of novoSpaRc in 
reconstructing complex tissues, we used mouse cerebellum slices 
from a recently developed spatial transcriptomics technology”. The
dataset of sagittal sections contained 46,376 locations, correspond- 
ing to a single cell or a few cells, with a median of 52 quantified tran- 
scripts per location. To provide enough information to novoSpaRc, we 
first coarse-grained the data by binning neighbouring locations. This 
resulted in 7,704 locations, with a median of 379 quantified transcripts
per location (Methods, Fig. 5a). novoSpaRc successfully reconstructed 
the whole transcriptome, with a Pearson correlation coefficient of 0.5 
over all 15,878 genes when using 15 marker genes and 0.94 when using 
50 marker genes (Fig. 5b, Supplementary Methods). Spatial expres- 
sion patterns emerged when using only a few markers. For example, 
spatial positions of Purkinje cells were revealed by reconstructing 
with only five marker genes (excluding all genes exhibiting an abso- 
lute Pearson correlation coefficient with Pcp4 of 0.25 or higher). The 
signal improved markedly when more markers were included (Fig. 5c). 
The reconstructed cerebellum slices showed concordance with the 
original spatial gene expression for a large number of known cell-type 
marker genes (Fig. 5d). To illustrate the versatility of novoSpaRc, we 
further applied it to a coronal section of a brain cerebellum”’, with 
similar results (Fig. 5e).

Next, we used novoSpaRc to spatially reconstruct a single-cell data- 
set from whole kidney”, which is a complex tissue with stereotypical 
organization. In the absence of a reference atlas of gene expression, the
reconstruction was performed de novo. We focused on six major cell 
types of the kidney (Extended Data Fig. 8) and mapped the cells onto 
a two-dimensional target space. The de novo reconstruction recapitulated
the urine flow within the kidney sub-compartments, as shown by
the spatial gene expression of corresponding marker genes (Extended
Data Fig. 8). We note that, as no prior information was required for this

Fig. 5 | novoSpaRc reconstructs mouse cerebellum tissue. a, The original and
the coarse-grained spatial expression of a marker of Purkinje cells (Pcp4) in a
sagittal section of the cerebellum from direct spatial RNA sequencing”. b, The 
overall Pearson correlation between original gene expression and gene 
expression predicted by novoSpaRc increases markedly as more marker genes 
are used. The correlation when using only five marker genes is substantially 


reconstruction, this case demonstrates the applicability of novoSpaRc
to a wide variety of medically relevant tissues.

Finally, to show that novoSpaRc can reconstruct not only a prototypical
tissue but also individual samples, we used a dataset that captures

higher than that of a random mapping of cells to locations. Density plots
contain values for all 15,878 genes. c, The spatial gene expression of Pcp4 is
visible with only five marker genes and is enhanced as more markers are used 
for the reconstruction. d, Examples of original and predicted expression for 
neuronal marker genes. Reconstruction was performed with 35 marker genes. 
e, novoSpaRc accurately reconstructs a coronal section of the cerebellum”. 


expression patterns in hundreds of individual Drosophila embryos”. 
In this dataset, the expression of four gap genes and four pair-rule 
genes was measured along the anterior—posterior axis for 101 and 
177 embryos, respectively, providing a distribution over expression

Fig. 6 | novoSpaRc identifies spatially informative genes. a, Identifying
spatially informative genes in the mammalian intestine and liver. We identify 
de novo (that is, with no marker genes used) the most highly zonated genes 
along the crypt-to-villus axis in the intestine (left) and across the axis of a liver 
lobule (right). The prediction of novoSpaRc is compared against the original 
expression patterns. The expression level of each gene is normalized to its 
maximum value. b-d, Identifying spatially informative genes in the Drosophila 
embryo (reconstruction with the BDTNP marker genes) and a slice of the
mammalian cerebellum (reconstruction with 50 markers), using a measure of
spatial autocorrelation. b, Expression patterns of the top 15 spatially
informative genes in the Drosophila embryo. c, The spatial autocorrelation
values (spatial information index) of the 84 transcription factors (TFs) chosen 
for the BDTNP dataset”? are among the highest values over all 8,924 genes of the 
fly embryo, demonstrating that they are identified to be highly spatially 
informative. d, Top 10 spatially informative genes (out of the top 1,000 variable 
genes) in a section of the cerebellum.


patterns. We used novoSpaRc to reconstruct the expression patterns of 
the gap and pair-rule genes for individual embryos. For a given embryo, 
novoSpaRc reconstruction using a reference atlas based on the gene
expression within the same embryo consistently outperformed recon- 
struction using a reference atlas based on the averaged gene expression 
across all embryos in the dataset (Extended Data Fig. 9), yet both approaches
reached high correlation values. The median Pearson correlation coefficients
for reconstructing a fourth gene based on the three remaining genes were
0.99 (reference atlas from the same embryo) and 0.95 (reference atlas
averaged across embryos) for the gap genes, and 0.94 and 0.77, respectively,
for the pair-rule genes.

We examined the effect of the interpolation between structural and 
marker gene information, and evaluated the performance of novoSpaRc 
by comparing it to available reconstruction methods that fully rely on 
a reference atlas (Seurat® and DistMap”) (Extended Data Figs. 10, 11). 
novoSpaRc has several advantages when compared to the other exist- 
ing methods and overall shows substantial benefits in reconstruction 
performance (Extended Data Fig. 10, Supplementary Discussion). 


Identifying spatially informative genes 


A novoSpaRc-based spatial reconstruction allows us to identify known
and potentially new spatially informative genes directly from the single- 
cell sequencing data. For the intestine and liver datasets, we recovered 
highly zonated genes without a reference atlas (Methods, Supplemen- 
tary Information), and found that the top inferred zonated genes were 
supported experimentally and/or computationally (Fig. 6a, Supplemen- 
tary Tables 1, 2). Gene ontology enrichment analysis” further revealed 
that zonation-compatible biological processes enriched for different 
domains in the intestine and the liver were reconstructed by novoSpaRc
(Supplementary Information). For the Drosophila single-cell data- 
set, we ranked all 8,924 genes according to their spatially informative 
rank (Methods, Fig. 6b, Supplementary Information), and found that 
transcription factors were (as known from classic genetics”) among 
the most highly informative genes (Fig. 6c). In addition, novoSpaRc 
identified numerous long non-coding RNAs and transcription factors 
as being highly spatially informative, many of them already predicted in 
a previous study”. Finally, we ranked all 15,878 genes in the cerebellum 
by their spatially informative rank (Methods, Fig. 6d, Supplementary 
Information), and found that well-known marker genes with a defined
pattern of spatial expression are indeed among the highest-ranking 
spatially informative genes (Fig. 6d). 


Discussion 


Together, we have demonstrated here that one can spatially reconstruct 
diverse biological tissues on the basis of a simple hypothesis about 
how gene expression is organized in space—a structural correspond- 
ence between the distances of cells in expression space and in physical 
space—and that it can be used to extract spatially informative genes. 
Our current implementation is based on pairwise comparison of cells 
and locations. This requirement can be readily altered. In fact, it is 
compelling to hypothesize that within certain biological contexts, 
different cell types may require higher-order interactions or exhibit 
different principles of spatial organization. Furthermore, we stress that 
because of the availability of general mathematical results in optimal- 
transport theory, our framework is versatile and can support a variety of 
alternative ways to compare distances in expression and physical space 
by varying the optimization loss functions (Methods, Supplementary 
Methods). Such alternative schemes are not currently supported by 
novoSpaRc, but could be implemented. 

Our data analyses and the success of the reconstructions by novo- 
SpaRc suggest that we have identified a general principle for how gene 
expression is organized in tissue space (Supplementary Discussion). It 
will be interesting to find tissues for which this organization principle 


is weak or not valid. However, this principle may be underestimated, as 
most of the single-cell data available are relatively shallow and noisy. 
Our data also suggest that many more genes than perhaps anticipated 
are involved in spatial features and functions (including physiology and 
pathophysiology) of tissue. We have demonstrated that we can system- 
atically identify at least a subset of these genes directly from single-cell 
data. In the future, we will extend these analyses to identify genes that 
are predicted to functionally interact in space. Finally, our developed 
framework can be flexibly extended beyond spatial reconstruction. We 
are currently using it to recover different types of biological signals, 
such as temporal progression on short (for example, cell cycle) and 
long (for example, developmental) timescales. 


Online content 


Any methods, additional references, Nature Research reporting sum- 
maries, source data, extended data, supplementary information, 
acknowledgements, peer review information; details of author con- 
tributions and competing interests; and statements of data and code 
availability are available at https://doi.org/10.1038/s41586-019-1773-3.

Methods


Data reporting 

No statistical methods were used to predetermine sample size. The 
experiments were not randomized and the investigators were not 
blinded to allocation during experiments and outcome assessment. 


Data pre-processing 

For the cases for which normalized data was not available or used by 
the authors, we adopted the standard library size normalization in 
log-space, for example, if $d_{ij}$ represents the raw count for gene i in cell
j, we normalized it as

$$d_{ij} \to \tilde{d}_{ij} = \log_2\left(1 + 10^5 \, \frac{d_{ij}}{\sum_i d_{ij}}\right)$$


Highly variable genes were identified by plotting the dispersion of a 
gene as a function of its mean and selecting the outliers above cut-off
values (usually 0.125 for the mean and 1.5 for the dispersion). 
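
The normalization and gene-selection steps described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the package's own code: the function names, the base-2 logarithm with a 10^5 scale factor, and the use of variance over mean as the dispersion measure are assumptions made for the example; only the cut-off values (0.125 and 1.5) are taken from the text.

```python
import numpy as np

def library_size_normalize(counts, scale=1e5):
    """Log-space library-size normalization (scale and log base are assumptions).

    counts: genes x cells matrix of raw counts d_ij (gene i, cell j).
    """
    counts = np.asarray(counts, dtype=float)
    lib = counts.sum(axis=0, keepdims=True)   # total counts per cell
    lib[lib == 0] = 1.0                       # guard against empty cells
    return np.log2(1.0 + scale * counts / lib)

def highly_variable(norm, min_mean=0.125, min_disp=1.5):
    """Indices of genes whose mean and dispersion both exceed the cut-offs.

    Dispersion is taken here as variance / mean, which is an assumption.
    """
    mean = norm.mean(axis=1)
    var = norm.var(axis=1)
    disp = np.divide(var, mean, out=np.zeros_like(var), where=mean > 0)
    return np.flatnonzero((mean > min_mean) & (disp > min_disp))
```

In practice the cut-offs would be tuned per dataset by inspecting the mean-dispersion plot, as the text indicates with "usually".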

In the Slide-seq datasets”, we summed up the transcriptomes of 
neighbouring cells by rounding the coordinates of the physical loca- 
tions to the next integer multiple of 50. This resulted in a total of 8,331 
(9,890) cells for the sagittal (coronal) section of the cerebellum. Low- 
quality locations were further filtered out by requiring at least 50 genes 
per cell, resulting in a total of 7,704 (8,258) for the sagittal (coronal) 
section. Marker genes for the reconstruction were randomly selected 
from the set of 747 genes. As one of the means of benchmarking the dif- 
ferent reconstructions was to visually assess the expression pattern of 
Pcp4, we ensured that no genes with a Pearson correlation of |R| > 0.25 
with Pcp4 were selected as marker genes. 
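
The coarse-graining of Slide-seq locations can be illustrated as follows. This is a hedged sketch: the function name `bin_locations` is hypothetical, "next integer multiple" is interpreted as rounding up, and the quality filter counts genes with at least one transcript per binned location.

```python
import numpy as np

def bin_locations(coords, counts, bin_size=50, min_genes=50):
    """Sum transcript counts of beads that fall into the same spatial bin.

    coords: (n, 2) bead coordinates; counts: (n, g) transcript counts.
    Returns the binned coordinates and the summed count matrix.
    """
    coords = np.asarray(coords, dtype=float)
    counts = np.asarray(counts)
    # Round every coordinate up to the next integer multiple of bin_size.
    binned = (np.ceil(coords / bin_size) * bin_size).astype(int)
    uniq, inverse = np.unique(binned, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    summed = np.zeros((len(uniq), counts.shape[1]), dtype=counts.dtype)
    np.add.at(summed, inverse, counts)        # accumulate beads per bin
    # Keep only bins in which enough distinct genes were detected.
    keep = (summed > 0).sum(axis=1) >= min_genes
    return uniq[keep], summed[keep]
```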


Mathematical formulation of novoSpaRc 

The procedure used by novoSpaRc includes several steps. We first 
compute the graph-based distance matrices for N single cells in expression
space, $D^{\mathrm{exp}} \in \mathbb{R}^{N \times N}$, and for M locations, $D^{\mathrm{phys}} \in \mathbb{R}^{M \times M}$ (Extended
Data Fig. 1a, Supplementary Methods). Then, optionally, if a reference
atlas is available, we compute the matrix of disagreement,
$D^{\mathrm{exp,phys}} \in \mathbb{R}^{N \times M}$, between each of the cells and each of the locations,
on the basis of the inverse correlation between the partial expression
profile for each location given by the reference atlas and the 
respective expression profile for each cell. Equipped with these meas- 
ures of intra- and inter-dataset distances, we set out to find an optimal 
(probabilistic) assignment of each of the single cells to cellular phys- 
ical locations. 
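
The three distance matrices can be sketched as below. The exact graph construction is given in the Supplementary Methods and is not reproduced here; this example assumes a symmetric k-nearest-neighbour graph with Euclidean edge weights, computes shortest paths with a plain Floyd-Warshall loop (suitable only for small inputs), and reads "inverse correlation" as one minus the Pearson correlation. All function names are hypothetical.

```python
import numpy as np

def graph_distances(points, k=5):
    """Shortest-path distances along a symmetric kNN graph (assumed construction)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    eucl = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(eucl, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    rows = np.repeat(np.arange(n), nearest.shape[1])
    cols = nearest.ravel()
    graph[rows, cols] = eucl[rows, cols]
    graph[cols, rows] = eucl[rows, cols]             # make the graph undirected
    for m in range(n):                               # Floyd-Warshall relaxation
        graph = np.minimum(graph, graph[:, [m]] + graph[[m], :])
    return graph

def atlas_disagreement(cell_markers, atlas_markers):
    """1 - Pearson correlation between each cell (rows) and each location (columns).

    cell_markers: (N, m); atlas_markers: (M, m); profiles must not be constant.
    """
    a = np.asarray(cell_markers, dtype=float)
    b = np.asarray(atlas_markers, dtype=float)
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T
```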

We formulate this problem as an optimization problem within the 
generalized framework of optimal transport°. Optimal transport is
a mathematical framework that was first established in the eighteenth
century by Gaspard Monge and was initially motivated by the question 
of the optimal (minimal cost) way to rearrange one pile of dirt into a dif- 
ferent formation (the respective minimal cost is appropriately termed 
the ‘earth mover’s distance’). The framework evolved both theoretically 
and computationally"*” and was extended to the correspondence 
between pairwise similarity measures via the Gromov-Wasserstein 
distance’’”°. Thus, in our context, it allows us to build on these results 
and tools to feasibly solve the cellular assignment problem. 

We aim to find a probabilistic embedding, $T \in \mathbb{R}_+^{N \times M}$, of N single cells
to M locations that would minimize the discrepancy between the pairwise
graph-based distances in expression space and in physical space,
and, if a reference atlas is available, simultaneously minimize the
discrepancy between its values across the tissue and the expression
profiles of embedded single cells. For each cell i, the value of $T_{i,j}$ is the
relative probability of embedding it into location j. These optimization
requirements over T are formulated as follows. We measure the pairwise
discrepancy of T for the expression and physical spaces using the
Gromov-Wasserstein discrepancy”

$$D_1(T) = \sum_{i,j,k,l} L\left(D^{\mathrm{exp}}_{i,k}, D^{\mathrm{phys}}_{j,l}\right) T_{i,j} \, T_{k,l}$$

where L is a loss function; specifically, we use the quadratic loss
$L(a, b) = \frac{1}{2}|a - b|^2$. This term captures our preference to embed single
cells such that their pairwise distance structure in expression space
would resemble their pairwise distance structure in physical space.
Intuitively, if expression profiles that correspond to cells i and k are
embedded into cellular locations j and l, respectively, then the distance
between i and k in expression space should correspond to the distance
between j and l in physical space (for example, if i and k are close expression-wise
they should be embedded into close locations, and vice versa).
The discrepancy measure weighs these correspondences by the respec- 
tive probability of the two embedding events. 
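
As a worked illustration of this discrepancy with the quadratic loss, the four-index sum can be evaluated without building the full tensor by expanding the square. The function name `gw_discrepancy` is hypothetical; the identity used is plain algebra and does not depend on any particular library.

```python
import numpy as np

def gw_discrepancy(D_exp, D_phys, T):
    """Sum over i,j,k,l of 0.5 * (D_exp[i,k] - D_phys[j,l])**2 * T[i,j] * T[k,l]."""
    p = T.sum(axis=1)                      # marginal over cells
    q = T.sum(axis=0)                      # marginal over locations
    # Expanding the square: the squared terms depend only on the marginals,
    # the cross term is the sum over i,j of T[i,j] * (D_exp T D_phys^T)[i,j].
    term_exp = 0.5 * p @ (D_exp ** 2) @ p
    term_phys = 0.5 * q @ (D_phys ** 2) @ q
    cross = np.sum((D_exp @ T @ D_phys.T) * T)
    return term_exp + term_phys - cross

# Two cells and two locations with identical distance structure.
D = np.array([[0.0, 1.0], [1.0, 0.0]])
matched = np.eye(2) / 2                    # each cell sent to one location
uniform = np.full((2, 2), 0.25)            # each cell spread over both locations
```

Here `gw_discrepancy(D, D, matched)` is 0 because the embedding preserves all pairwise distances, whereas the uniform coupling gives 0.25.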

To measure the match to existing prior knowledge, or an available 
reference atlas, we consider 


$$D_2(T) = \sum_{i,j} D^{\mathrm{exp,phys}}_{i,j} \, T_{i,j}$$


This term represents the average discrepancy between cells and 
locations according to the reference atlas, weighted by T. Finally, we 
regularize T by favouring embeddings with higher entropy, where
entropy is defined as 


$$H(T) = -\sum_{i,j} T_{i,j} \log T_{i,j}$$


Intuitively, higher entropy implies more uncertainty in the map- 
ping. Entropic regularization drives the solution away from arbitrary 
deterministic choices and was shown to be computationally efficient”. 

Putting these together, we define the optimization problem for the 
optimal probabilistic embedding T:

$$T^* = \underset{T}{\arg\min} \; (1 - \alpha) D_1(T) + \alpha D_2(T) - \epsilon H(T)$$

subject to

$$\sum_j T_{i,j} = p_i \quad \forall i \in \{1, \dots, N\}$$
$$\sum_i T_{i,j} = q_j \quad \forall j \in \{1, \dots, M\}$$

where $\epsilon$ is a non-negative regularization constant, and $\alpha \in [0, 1]$ is a
constant interpolating between the first two objectives, and can be set
to $\alpha = 0$ when no reference atlas is available. The constraints reflect the
fact that the transport plan T should be consistent with the marginal
distributions $p \in \{p \in \mathbb{R}_+^N; \sum_i p_i = 1\}$ and $q \in \{q \in \mathbb{R}_+^M; \sum_j q_j = 1\}$ over
the original input spaces of expression profiles and cellular locations,
respectively.

These marginals can capture, for example, varying densities of single 
cells in the vicinity of different cellular grid locations, or the quality 
of different single-cell expression profiles (hence forcing low-quality 
single cells to have a smaller contribution to the reconstructed tissue-wide
expression patterns). When such prior knowledge is lacking, p
and q could be set to be uniform distributions. 

We derive an efficient algorithm for this optimization problem, 
inspired by the combined results for entropically regularized opti- 
mal transport” and mapping based on Gromov-Wasserstein distance 
between metric-measure spaces”° (Supplementary Methods). 
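
One way such a scheme could look is sketched below. It is not the algorithm derived in the Supplementary Methods: it simply linearizes the two discrepancy terms at the current embedding and projects with Sinkhorn scaling, in the spirit of entropic Gromov-Wasserstein solvers. The function names, default parameters and iteration counts are assumptions, and distance matrices are assumed symmetric and scaled to order one so that the exponentials stay well behaved.

```python
import numpy as np

def sinkhorn(cost, p, q, eps, n_iter=500):
    """Entropic optimal transport for a fixed cost matrix (Sinkhorn scaling)."""
    K = np.exp(-(cost - cost.min()) / eps)   # a constant shift leaves the plan unchanged
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (K.T @ u)                    # match column marginals
        u = p / (K @ v)                      # match row marginals
    return u[:, None] * K * v[None, :]

def embed(D_exp, D_phys, D_atlas=None, alpha=0.0, eps=0.05, p=None, q=None, n_outer=30):
    """Iteratively linearize (1 - alpha) * D_1 + alpha * D_2 and re-solve with Sinkhorn."""
    N, M = D_exp.shape[0], D_phys.shape[0]
    p = np.full(N, 1.0 / N) if p is None else np.asarray(p, dtype=float)
    q = np.full(M, 1.0 / M) if q is None else np.asarray(q, dtype=float)
    if D_atlas is None:                      # de novo mode: structural term only
        D_atlas, alpha = np.zeros((N, M)), 0.0
    T = np.outer(p, q)                       # start from the independent coupling
    # Part of the quadratic-loss gradient that depends only on the marginals.
    const = 0.5 * ((D_exp ** 2) @ p)[:, None] + 0.5 * ((D_phys ** 2) @ q)[None, :]
    for _ in range(n_outer):
        grad_D1 = 2.0 * (const - D_exp @ T @ D_phys.T)   # gradient of D_1 at T
        cost = (1.0 - alpha) * grad_D1 + alpha * D_atlas
        T = sinkhorn(cost, p, q, eps)
    return T
```

The row marginals of the returned matrix equal p by construction of the last scaling step; the column marginals approach q as the inner iterations converge.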

Then, given the original single-cell expression profiles, represented
by a matrix $Y \in \mathbb{R}^{N \times g}$ (for N single cells and g genes), and the inferred
probabilistic embedding $T \in \mathbb{R}_+^{N \times M}$ (for N single cells and M locations),
we can derive a virtual in situ hybridization (vISH), $S = Y^{\top} T \in \mathbb{R}^{g \times M}$
(for g genes and M locations), which contains the gene expression values
for every cellular location of the target space.

Note again that because our mapping is probabilistic, each of the 
cellular locations of the vISH does not correspond to a single cell in 
the original data. Rather, the vISH represents the expression patterns 
over an averaged, stereotypical tissue from which the single cells could 
have originated. 
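
The vISH is a single matrix product, as the small example below shows. The function name `virtual_ish` is hypothetical; the shapes follow the definitions in the text (cells by genes for the expression matrix, cells by locations for the embedding).

```python
import numpy as np

def virtual_ish(Y, T):
    """Return S = Y^T T with shape (genes, locations).

    Y: (N, g) single-cell expression; T: (N, M) probabilistic embedding.
    """
    return np.asarray(Y, dtype=float).T @ np.asarray(T, dtype=float)

# Two cells, one gene, two locations; each cell is assigned to its own location.
Y = np.array([[2.0], [4.0]])
T = np.array([[0.5, 0.0], [0.0, 0.5]])
S = virtual_ish(Y, T)    # array([[1., 2.]])
```

Because each column of T sums to the location marginal, every entry of S is a probability-weighted sum over cells, which is why a location in the vISH reflects an average over cells and not one specific cell.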


novoSpaRc algorithm 
To spatially reconstruct gene expression, novoSpaRc performs the 
following steps: 
1. Read the gene expression matrix. 
1a. Optional: select a random set of cells for the reconstruction;
1b. Optional: select a small set of genes (for example, highly 
variable). 
2. Construct the target space. 
3. Set up the optimal-transport reconstruction. 
3a. Optional: use existing information of marker genes, if available. 
4. Perform the spatial reconstruction including: 
4a. Assigning cells a probability distribution over the target space; 
4b. Deriving a vISH for all genes over the target space. 
The novoSpaRc package, system requirements, installation guide
and demo instructions are provided at
https://github.com/rajewsky-lab/novosparc.


Generating in silico single-cell data for the BDTNP dataset 

To test the performance of novoSpaRc with single-cell resolution
ground truth, we generated an in silico single-cell dataset for the BDTNP 
data”. In that case we have access to expression profiles for different 
locations across the embryo. We effectively dissociate the embryo by 
taking these expression profiles to be the expression profiles of single 
cells in our in silico set, masking their true original locations, and use 
novoSpaRc to reconstruct the original embryo (which may be done at 
lower spatial resolution). 
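
The in silico dissociation amounts to shuffling the rows of the atlas and hiding the permutation until evaluation. A minimal sketch, with the hypothetical helper name `dissociate`:

```python
import numpy as np

def dissociate(atlas_expression, seed=None):
    """Treat each atlas location as one in silico cell and mask its position.

    atlas_expression: (M, g) expression per location.
    Returns the shuffled profiles and the hidden ground-truth location of each cell.
    """
    atlas_expression = np.asarray(atlas_expression)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(atlas_expression))
    return atlas_expression[order], order
```

The returned `order` is used only after reconstruction, to compare the inferred location of each cell with its true origin.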


Identification of spatial archetypes 

The identification of spatial archetypes is performed by cluster- 
ing the spatial expression of a given set of genes. The gene expres- 
sion is first clustered by hierarchical clustering at the vISH level, 
although in principle different clustering methods can be used. The 
number of archetypes is chosen by visually inspecting the resulting 
dendrogram. The expression values of each gene of the cluster are then 
averaged per location to produce the spatial archetype for that cluster. 
Representative genes for each cluster are identified by computing the 
Pearson correlation of each gene within the cluster against the spatial 
archetype. The derivation of the spatial archetypes strongly depends 
on the set of genes used. We observed that the set of highly variable 
genes generally resulted in sensible spatial archetypes. A list of genes 
that correspond to each archetype is provided in the Supplementary 
Information. 
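
Given cluster labels for the genes, the archetype and representative-gene steps could be computed as in the sketch below. The labels are taken as an input here (for instance from cutting a hierarchical-clustering dendrogram at a chosen height); the function name `spatial_archetypes` and the return format are assumptions.

```python
import numpy as np

def spatial_archetypes(S, labels):
    """Average vISH patterns per cluster and rank genes by similarity to the average.

    S: (g, M) reconstructed expression; labels: length-g cluster label per gene.
    Returns {cluster: archetype pattern} and {cluster: gene indices, best first}.
    Genes with constant expression yield undefined correlations and should be
    removed beforehand.
    """
    S = np.asarray(S, dtype=float)
    labels = np.asarray(labels)
    archetypes, representative = {}, {}
    for c in np.unique(labels):
        genes = np.flatnonzero(labels == c)
        arch = S[genes].mean(axis=0)            # per-location average of the cluster
        corr = np.array([np.corrcoef(S[g], arch)[0, 1] for g in genes])
        archetypes[int(c)] = arch
        representative[int(c)] = genes[np.argsort(-corr)]
    return archetypes, representative
```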


Identification of zonated genes 

For tissues with one-dimensional symmetry, we produce a ranking of 
highly zonated genes, both according to the original spatial expression 
patterns (Extended Data Fig. 2c, d) and the reconstructed patterns 
(Fig. 6a). 

The input is a spatial expression matrix (either original or recon- 
structed), specifying the expression level of each gene in each of the 
spatial zones. Then, to find a ranked list of genes that are highly zonated
towards the first or last spatial zones (for example, the crypt in the intestine), we
first select all genes (i) whose highest expression occurs in that respec- 
tive zone; (ii) whose maximum expression value is in the top 1% of all 
genes; and (iii) that are statistically significantly zonated. To compute 
the zonation significance of individual genes, we used a non-parametric
test based on the Kendall’s tau coefficient. The Kendall's tau coefficient 


is a measure for the correspondence between two ranked lists; in our
case, the expression values of a given gene over consecutive spatial 
zones and the numbering of the zones. Finally, the remaining genes 
are ranked according to their centre of mass. 
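
The selection and ranking can be sketched as follows. Two simplifications are assumptions of this example and not the published procedure: the significance test is replaced by a threshold on the absolute Kendall's tau (tau-a, without tie correction), and genes are ordered so that those whose centre of mass lies closest to the target zone come first. The function names are hypothetical.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a between two equally long sequences (no tie correction)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    return (sx * sy).sum() / (n * (n - 1))     # every pair is counted twice

def rank_zonated(expr, towards="first", top_fraction=0.01, min_abs_tau=0.8):
    """Rank genes zonated towards the first or last zone. expr: (genes, zones)."""
    expr = np.asarray(expr, dtype=float)
    zones = np.arange(expr.shape[1])
    target = 0 if towards == "first" else expr.shape[1] - 1
    peak_ok = expr.argmax(axis=1) == target                  # criterion (i)
    max_expr = expr.max(axis=1)
    high_ok = max_expr >= np.quantile(max_expr, 1.0 - top_fraction)   # criterion (ii)
    tau = np.array([kendall_tau(g, zones) for g in expr])
    zonated_ok = np.abs(tau) >= min_abs_tau                  # stand-in for criterion (iii)
    keep = np.flatnonzero(peak_ok & high_ok & zonated_ok)
    com = (expr[keep] * zones).sum(axis=1) / expr[keep].sum(axis=1)   # centre of mass
    order = np.argsort(com) if towards == "first" else np.argsort(-com)
    return keep[order]
```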

The lists of predicted zonated genes based on novoSpaRc’s recon- 
struction for the mammalian intestine and liver are available in the Sup- 
plementary Information. 


Gene ontology enrichment 

We used GOrilla for gene ontology (GO) enrichment analysis”, 
in which GO enrichment was computed on the basis of target and 
background lists of genes (Supplementary Methods). For both the 
target and background lists of genes, we selected genes that had a maxi- 
mum expression value in the top 10% of all genes. The target lists for 
genes that were zonated towards the boundaries of the one-dimensional 
spatial axes (crypt and V6 in intestine; layers 1 and 9 in liver) were further
filtered to contain only genes that are statistically significantly zonated, 
as described in ‘Identification of zonated genes’. The background lists 
contained the corresponding complements of the target lists. 


Identification of spatially informative genes 

We use a spatial autocorrelation measure to rank genes as spatially
informative. Specifically, we use Moran’s I as a measure for global spatial
autocorrelation. For each individual gene i, the Moran’s I score for its
spatial expression, $y_i$, over n cellular locations is:

$$I_i = \frac{n}{S_0} \, \frac{\sum_{j,k} w_{j,k} \, z_{i,j} \, z_{i,k}}{\sum_j z_{i,j}^2}$$

where $z_i = y_i - \bar{y}_i$ (with $z_{i,j}$ its value at location j), $\bar{y}_i$ is the mean
expression of gene i, $S_0 = \sum_{j,k} w_{j,k}$ and $w_{j,k}$
is a spatial weights matrix, which we base on a k-nearest neighbours
graph for each cellular location (k = 8). To calculate the Moran’s I score
and the respective P values for different genes, we used the implementation
of PySAL, a Python spatial analysis library”’.

The Moran’s / scores with their respective P values, based on novo- 
SpaRc’s reconstructions for all genes of the Drosophila embryo, 
zebrafish embryo and cerebellum, are available in the Supplementary 
Information. 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The scRNA-seq datasets were acquired from the Gene Expression Omni- 
bus (GEO) database with the following accession numbers: GSE99457 
for the intestinal epithelium’, GSE84.490 for the liver’, GSE95025 for 
the Drosophila embryo”, GSE66688 for the zebrafish embryo° and 
GSE107585 for the kidney”. The cerebellum Slide-seq datasets” were 
acquired from the Broad Institute Single Cell Portal
(https://portals.broadinstitute.org/single_cell/study/slide-seq-study). The individual
Drosophila embryos dataset” is available as a supplementary informa- 
tion file of the original manuscript”. The BDTNP dataset was down- 
loaded directly from the BDTNP webpage”. 


Code availability 


A Python package for novoSpaRc, and the scripts for reconstructing
selected tissues presented in the manuscript, are provided at
https://github.com/rajewsky-lab/novosparc.

Extended Data Fig. 1 | Overview of probabilistic optimal matching using
novoSpaRc and corresponding generative model. a, Based on the raw data of
single cells in expression space and locations along a grid resembling the target
tissue, graph structures are computed and distance matrices are derived from
these graphs (Supplementary Methods). The two branches, and potentially
a reference atlas, are aligned using novoSpaRc, under our structural
correspondence assumption (distance in expression space on average
monotonically increases with distance in physical space) and by using
probabilistic embedding (Supplementary Methods). b, c, Left, visualization of
noisy expression patterns for three random genes in models for one-dimensional
(1D) (b) and two-dimensional (2D) (c) tissues. Right, the original expression
pattern for a representative gene, its coarse-grained representation


(decreased spatial resolution) and its reconstruction using novoSpaRc. d, The 
Pearson correlation of the reconstructed expression pattern datatothe 
original synthetic expression data increases with increasing signal-to-noise 
ratio, with the number of marker genes and with the fraction of informative 
genes, and exhibits non-monotonic behaviour with the a parameter. We note 
that ais an interpolation parameter (defined in the Methods section 
‘Mathematical formulation of novoSpaRc’) between using only areference 
atlas (a@=1) and using only structural information (driven by the structural 
correspondence assumption) (a=1). Results are averaged over 100 
instantiations of the generative model; data are mean +s.d. The generative 
modeland its default parameters are described in the Supplementary 
Methods. 
Extended Data Fig. 2 | Evaluation of novoSpaRc reconstruction of the
intestinal epithelium and the liver lobule. a, b, The fraction of cells in the
crypt-to-villus axis (a) and the liver lobule axis (b) that is correctly assigned to
its corresponding original villus zone and original lobule layer, or is assigned
to a zone up to d zones away from the original zone (x axis), is substantially
higher than that of random assignment. c, d, novoSpaRc reconstructs the
spatial expression patterns of the top zonated genes in the intestinal
epithelium (c) (10 top zonated genes towards the crypt, and 10 top zonated
genes towards V6) and in the liver lobule (d) (10 top zonated genes towards the
central vein (CV), and 10 top zonated genes towards the portal node (PN)).
2810417H13Rik is also known as Pclaf. The selection of the top zonated genes is
described in the Methods. The expression level of each gene in c and d is
normalized to its maximum value.
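
The accuracy measure in panels a and b can be sketched as follows: for each tolerance d, count the fraction of cells whose assigned zone lies within d zones of the original zone. This is a minimal illustration under the assumption that zones are encoded as consecutive integers; the function name and the toy labels are hypothetical and not taken from the paper.

```python
import numpy as np


def fraction_within(assigned_zone, original_zone, d):
    """Fraction of cells assigned to a zone at most d zones from the original."""
    assigned = np.asarray(assigned_zone)
    original = np.asarray(original_zone)
    return float(np.mean(np.abs(assigned - original) <= d))


if __name__ == "__main__":
    original = [0, 1, 2, 3, 6]   # toy original zone per cell
    assigned = [0, 2, 2, 5, 6]   # toy zone with the highest mapping probability
    for d in range(3):
        print(d, fraction_within(assigned, original, d))
```

Applying the same function to randomly permuted assignments would give the random-assignment baseline against which the caption compares.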
Extended Data Fig. 3 | novoSpaRc reconstruction of the intestinal
epithelium and the liver lobule is robust and consistent with changing grid
resolution. a, b, Examples of FISH expression patterns of six zonated genes
across the liver lobules, comparing the reconstructed (de novo vISH data)
expression patterns produced by novoSpaRc to the expression patterns
reported in a previous study (a), and the original (FISH) data (adapted from the
same study) (b). The visualization in a is a heat map, which shows the
expression values of each gene across the lobule layers. The visualization of the
reconstructed vISH data in b is intended to be comparable to the FISH images,
and therefore the 1D reconstructed coordinates are projected onto a polar
coordinate system (central vein-middle, portal node-outer circumference).
c, The successful de novo reconstruction of the intestinal epithelium dataset
is achieved for varying numbers of layers used for the target space (including
both lower and higher numbers of layers compared with the original number
(seven) of reference layers). The expression level of each gene is normalized to
its maximum value.
Extended Data Fig. 4 | novoSpaRc reconstruction of the Drosophila embryo
on the basis of the BDTNP dataset is robust and self-consistent. a, b, The
Pearson correlation of the reconstructed expression patterns to the original
FISH expression data increases with the number of genes used to construct
the structural cellular graph in expression space (a), and with the fraction of
those genes that are spatially informative (b). Spatially non-informative genes
in this case were simulated as random Gaussian variables with mean and s.d.
comparable to that of the original set of genes. c-f, The Pearson correlation of
the reconstructed expression patterns to the original FISH expression data
increases with the percentage of sampled single cells (without replacement) (c)
and with the percentage of sampled single cells (with replacement) (d), and
steadily decreases with noise level (e) and with the percentage of dropouts in
the data (f). g, The mean value and variance of the optimization objective
function (which we aim to minimize) increase with noise level. The results in
a-g are averaged over 100 random choices of two marker genes; data are
mean ± s.d. h, The Pearson correlation of the de novo reconstructed expression
patterns to the original FISH data varies gradually with the entropic
regularization parameter ε. i, The Pearson correlation of embedded de novo
expression patterns of the BDTNP dataset for different values of the entropic
regularization parameter ε with the expression pattern for ε=5 x10 (vertical
dotted line). j, The spatial s.d. of embedded cells over the Drosophila embryo of
the BDTNP dataset derived from de novo reconstruction by novoSpaRc is
significantly lower than the s.d. derived from randomized embedding
(P<10°°, two-sided Kolmogorov-Smirnov test). Histograms show results for
all 3,039 cells.
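
The quality measure used throughout these captions, the Pearson correlation between reconstructed and original expression patterns, can be sketched per gene as below. The array layout (locations in rows, genes in columns) and the function name are assumptions made for illustration and do not come from the novoSpaRc package.

```python
import numpy as np


def per_gene_pearson(reconstructed, original):
    """Pearson correlation across locations, computed separately for each gene.

    Both inputs are arrays of shape (n_locations, n_genes).
    """
    r = np.asarray(reconstructed, dtype=float)
    o = np.asarray(original, dtype=float)
    rc = r - r.mean(axis=0)  # centre each gene over locations
    oc = o - o.mean(axis=0)
    denom = np.sqrt((rc ** 2).sum(axis=0) * (oc ** 2).sum(axis=0))
    return (rc * oc).sum(axis=0) / denom


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.random((50, 3))                           # toy reference patterns
    reconstructed = original + 0.1 * rng.normal(size=(50, 3))  # noisy reconstruction
    print(per_gene_pearson(reconstructed, original))
```

Averaging the returned values over genes, and over repeated random choices of marker genes, gives summary numbers of the kind plotted in the panels.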
Extended Data Fig. 5 | novoSpaRc accurately reconstructs the Drosophila
embryo on the basis of the BDTNP dataset and single-cell data. a, Examples
of mapping probabilities of single cells produced by novoSpaRc for the
Drosophila embryo, using the BDTNP dataset. The predicted spatial positions
of cells are distributed over relatively many locations when reconstruction is
done de novo, and are more localized when marker genes are used.
b, Histogram of Euclidean distances between the original cellular location of
single cells and the most likely location predicted by novoSpaRc using one and
two marker genes, compared to a histogram for random spatial predictions.
c, The expression patterns of the two marker genes and one marker gene that
were used for the results presented in a, b and in Fig. 3d, e. d, Visualization of
reconstruction results for four transcription factors. The original FISH data are
compared to reconstruction by novoSpaRc that exploits both structural and
marker gene information (using two marker genes and one marker gene),
and reconstruction without any marker gene information (de novo).
Reconstruction that uses both structural and marker gene information (or a
reference atlas) outperforms reconstruction that is based solely on a reference
atlas. e, Visualization of novoSpaRc-based reconstruction results for the four
transcription factors, based on single-cell data that exploit both structural
and marker gene information (using 10-80 marker genes). The results in a-d
are based on the BDTNP dataset, and the results in e are based on a single-cell
dataset.
Extended Data Fig. 6 | novoSpaRc identifies spatially informative
archetypes by using scRNA-seq data for the Drosophila embryo. The
archetypes shown complement those of Fig. 4c, d. Preferred spatial
positioning is denoted by colouring ranging from blue (low) to yellow (high).
FISH images were taken from the BDGP database. For genes for which an
image was not available, DVEX was used instead. Two representative genes are
shown for each spatial archetype. novoSpaRc accurately groups genes
expressed in a particular domain, for example the subdomain of the
mesoderm that is characterized by the transcription factor gcm (Archetype
5), whereas it does not capture the details of the fine expression patterns of
pair-rule genes (Archetype 8). CG42666 is also known as prage.
Extended Data Fig. 7 | novoSpaRc reconstructs the zebrafish embryo.
a, Histograms assessing the increase in the accuracy of novoSpaRc
reconstruction (measured by the Pearson correlation with FISH data) with
increasing number of marker genes. b, novoSpaRc reconstructs patterns of
gene expression in the zebrafish embryo on the basis of only 15 marker genes,
and the results improve as the number of marker genes increases. Top row, FISH
data (reproduced from ref. °); second row, Seurat predictions using 47 marker
genes; bottom three rows, novoSpaRc predictions using 15, 30 and 47 marker
genes. The genes shown were not used in any of the reconstructions.
Extended Data Fig. 8 | novoSpaRc reconstructs a whole-kidney dataset
de novo. a, Sketch of the major cell types that are reconstructed with
novoSpaRc. b, Representative marker genes for each of the cell types shown in
a. Top rows depict a rough positioning for each cell type in yellow-green;
bottom rows show the gene expression predicted by novoSpaRc in the
reconstructed tissue. Nphs1, podocytes; Nrp1, endothelial cells; Slc27a2,
proximal tubule cells; Umod, loop of Henle; Pvalb, distal convoluted tubules;
Aqp2, collecting duct cells. Expression ranges from low (blue) to high (yellow).
Extended Data Fig. 9 | novoSpaRc reconstructs single Drosophila embryos.
a, e, The averaged original expression of four gap genes (a) and four pair-rule
genes (e) is shown for 101 and 177 individual Drosophila embryos,
respectively. Solid line, mean; dark shadow, s.d.; light shadow, minimum and
maximum values over all embryos. b, f, Demonstration of the monotonic
relationship between cellular pairwise distances in expression and physical
space, consistent with the structural correspondence assumption. Data are
mean ± s.d. c, g, The Pearson correlation increases with the number of marker
genes used by novoSpaRc for the reconstruction of the remaining genes
(α = 0.5) for both gap genes (c) and pair-rule genes (g). Using a reference atlas
that corresponds to the individual embryo being reconstructed (‘individual
atlas’) results in a consistently higher reconstruction quality than using an
averaged reference atlas over all embryos (‘averaged atlas’). Data are
mean ± s.d. d, h, Examples of the reconstruction of the expression patterns
across a single random embryo, in which the reconstruction of each of the four
genes is performed using the three complement genes as a reference, for both
gap genes (d) and pair-rule genes (h). Note that the reconstructed expression
patterns presented in d, h were computed while the corresponding gene in
each case was not used for the reconstruction. The expression level of each
gene in a, d, e, h is normalized to the maximum value over the mean expression
of all embryos.
Extended Data Fig. 10 | Comparison of spatial reconstruction with
novoSpaRc versus available methods that fully rely on a reference atlas.
a, The Pearson correlation of the predicted versus the original spatial gene
expression is shown as a function of the top 100 highly variable genes for the
intestinal epithelium and liver datasets, or the number of marker genes used
for the reconstruction for the BDTNP dataset, the Drosophila and zebrafish
embryos and the brain cerebellum (84, 84, 45 and 745 genes, respectively). For
the 1D datasets, the reconstructions are done de novo (with no reference atlas)
and the existing baseline methods are inapplicable. For the liver, the last lobule
layer was removed from the analysis, as only five cells were associated with it.
For the 2D datasets, correlations are computed only for genes that were not
used for the reconstructions. Note that for the Drosophila embryo novoSpaRc
outperforms DistMap, and for the zebrafish embryo novoSpaRc performs
comparably to or better than Seurat, although those methods were
developed and tailored for the Drosophila and zebrafish embryos, respectively,
and the best-performing threshold was chosen for DistMap. For the box plots,
the centre line is the median, box limits are the 0.25 and 0.75 quantiles and
whiskers extend to ±2.698 s.d. For the BDTNP dataset, the Drosophila and
zebrafish embryos and the brain cerebellum, the results are shown for 100
random choices of marker genes. b, The intrinsic characteristics of novoSpaRc
compared against Seurat and DistMap.
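
The whisker extent of ±2.698 s.d. quoted for the box plots matches what the common 1.5 × IQR whisker rule gives for normally distributed data. The short check below works this out with the Python standard library; that the caption refers to this convention is an assumption, and the variable names are illustrative.

```python
from statistics import NormalDist

# Upper quartile of a standard normal distribution, in units of s.d.
q3 = NormalDist().inv_cdf(0.75)

# Interquartile range: the quartiles are symmetric about the mean.
iqr = 2 * q3

# Usual whisker rule: upper quartile plus 1.5 times the IQR.
upper_whisker = q3 + 1.5 * iqr

print(round(q3, 4), round(iqr, 4), round(upper_whisker, 3))
```

By symmetry the lower whisker sits at the same distance below the mean, which gives the ± form used in the caption.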
Extended Data Fig. 11 | Reconstruction quality varies with the α parameter.
Reconstructions of the BDTNP dataset, the Drosophila and zebrafish embryos
and the brain cerebellum, with varying numbers of marker genes used for the
reconstruction and different values of the α parameter. The reconstruction
quality is quantified by calculating Pearson correlations between the predicted
and the original patterns of gene expression for all genes that were not used as
markers for the reconstruction. The quality of the reconstruction decreases for
α = 1 in the BDTNP and brain cerebellum cases, which corresponds to
reconstructing based only on reference marker genes, without taking the
structural correspondence assumption into account. We note that α is an
interpolation parameter (defined in the Methods section ‘Mathematical
formulation of novoSpaRc’) between using only a reference atlas (α = 1) and
using only structural information (driven by the structural correspondence
assumption) (α = 0). For the box plots, the centre line is the median, box limits
are the 0.25 and 0.75 quantiles and whiskers extend to ±2.698 s.d. Results are
shown for 100 random choices of marker genes.
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 
Policy information about availability of computer code


Data collection: No software was used for data collection. All data shown in the manuscript are already publicly available.


Data analysis: We wrote custom software code, which is available online on GitHub (distributed under the MIT License, version 0.2.2,
https://github.com/rajewsky-lab/novosparc). The code is written in Python and uses commonly used Python libraries (numpy, matplotlib,
sklearn, scipy, ot). To calculate spatial autocorrelation we used the implementation of PySAL (version 2.0.0), a Python spatial analysis
library. We also used an implementation of the Gromov-Wasserstein transport method by Erwan Vautier (distributed under the MIT
License).
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:


- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability


No datasets were generated during the current study. The single cell datasets analyzed for the current study were acquired from the GEO database with the
following GEO accession numbers: GSE99457 for the intestinal epithelium, GSE84490 for the liver, GSE95025 for the Drosophila embryo, GSE66688 for the zebrafish
embryo and GSE107585 for the kidney. The cerebellum Slide-seq datasets were acquired from the Broad Institute Single Cell Portal
(https://portals.broadinstitute.org/single_cell/study/slide-seq-study). The individual Drosophila embryos dataset (Petkova, M.D., et al., Cell 2019) is available as
Supplemental Information files of the original manuscript. The BDTNP dataset was downloaded directly from the BDTNP webpage (http://bdtnp.lbl.gov).
Haem is an essential prosthetic group of numerous proteins and a central signalling
molecule in many physiologic processes. The chemical reactivity of haem means
that a network of intracellular chaperone proteins is required to avert the cytotoxic
effects of free haem, but the constituents of such trafficking pathways are unknown.
Haem synthesis is completed in mitochondria, with ferrochelatase adding iron to
protoporphyrin IX. How this vital but highly reactive metabolite is delivered from
mitochondria to haemoproteins throughout the cell remains poorly defined. Here
we show that progesterone receptor membrane component 2 (PGRMC2) is required
for delivery of labile, or signalling haem, to the nucleus. Deletion of PGRMC2 in brown
fat, which has a high demand for haem, reduced labile haem in the nucleus and
increased stability of the haem-responsive transcriptional repressors Rev-Erbα and
BACH1. Ensuing alterations in gene expression caused severe mitochondrial defects
that rendered adipose-specific PGRMC2-null mice unable to activate adaptive
thermogenesis and prone to greater metabolic deterioration when fed a high-fat diet.
By contrast, obese-diabetic mice treated with a small-molecule PGRMC2 activator
showed substantial improvement of diabetic features. These studies uncover a role
for PGRMC2 in intracellular haem transport, reveal the influence of adipose tissue
haem dynamics on physiology and suggest that modulation of PGRMC2 may revert
obesity-linked defects in adipocytes.


A small molecule has recently been isolated that stimulated adipogenesis
by acting as a gain-of-function ligand for PGRMC2, a
poorly characterized single-pass transmembrane protein localized
in the endoplasmic reticulum and the nuclear envelope. PGRMC2
belongs to the membrane-associated progesterone receptor
(MAPR) family, the members of which share a non-covalent
haem-binding domain. Other MAPR proteins (such as PGRMC1, neudesin
and neuferricin) bind haem reversibly. We found that PGRMC2
also reversibly bound haem. Of note, addition of haem boosts
adipogenesis, whereas inhibition of biosynthesis blocks differentiation.
The adipogenic effects of haem have been linked to the nuclear
receptor Rev-Erbα, a transcriptional repressor with a dual role
in adipogenesis: it is required early on, but it must be degraded for
differentiation to proceed. Haem is a ligand for Rev-Erbα, and
binding of haem leads to eventual Rev-Erbα degradation. Notably,
the adipogenic effect of the PGRMC2 activator was dependent on
Rev-Erbα signalling, hinting that PGRMC2 activation may stimulate
adipogenesis by facilitating haem delivery to the nucleus to induce
Rev-Erbα degradation. Here, we have examined a role for PGRMC2 in
intracellular haem mobilization.


PGRMC2 traffics mitochondrial haem


PGRMC2 protein purified from Escherichia coli was noticeably reddish
in colour (Fig. 1a). Its spectrum revealed the Soret peak of haemoproteins
at 390-430 nm (Extended Data Fig. 1a), and liquid chromatography-mass
spectrometry (LC-MS/MS) showed a 616.18-Da peak,
corresponding to iron-protoporphyrin IX (Fig. 1b, Extended Data
Fig. 1b, c), confirming that PGRMC2 co-purified with haem. To test
the ability of PGRMC2 to transfer haem (a requirement for a
haem-mobilizing chaperone), we incubated PGRMC2 with apo-horseradish
peroxidase (apo-HRP), an inactive form of the enzyme lacking its
prosthetic haem. Incubation of apo-HRP with haemin or PGRMC2
increased HRP activity, reflecting conversion of apo-HRP into active,
haem-bound holoHRP (Fig. 1c), which indicates that PGRMC2 can transfer
haem to other proteins. To test the ability of PGRMC2 to transfer
haem to Rev-Erbα itself, apo-Rev-Erbα was incubated with PGRMC2,
the mixture was separated by native electrophoresis, and the gel was
stained for haem and protein. In-gel staining revealed haem bound
to PGRMC2, but not to apo-Rev-Erbα (Fig. 1d). By contrast,
apo-Rev-Erbα incubated with wild-type PGRMC2, but not with a haem-binding
mutant (PGRMC2(3xM), Extended Data Fig. 1d, e), showed haem
staining, indicating transfer of haem from PGRMC2 to apo-Rev-Erbα
(Fig. 1d). Consistent with a role in serial trafficking, PGRMC2 displayed
medium-low affinity for haem (dissociation constant, Kd = 1.4 x10° M
for ferric and 5.3 x 10° M for ferrous) (Fig. 1e).

Total intracellular haem
is the sum of haem covalently bound or nearly so as a cofactor, and
labile (or signalling) haem, which is buffered by proteins and available
for exchange and regulatory events. To assess the ability of PGRMC2
to modulate subcellular labile haem levels, we transfected HEK293T
cells with GFP-haemoprotein peroxidase fusion reporters targeted to
mitochondria, endoplasmic reticulum, cytosol and nuclei (Extended
Data Fig. 1f). The activity of these reporters depends on the availability
of labile haem in these compartments. Because intracellular labile
haem may be derived from the medium or from endogenous synthesis,
to examine the contribution of PGRMC2 to haem mobilization from
either source, we used succinylacetone to block biosynthesis and
haem-depleted fetal bovine serum (FBS) to minimize exogenous haem
uptake. Measurements in control cells showed that the mitochondrial
and nuclear labile haem pools were derived entirely from endogenous
synthesis, for the activity of these reporters was fully blunted in cells
treated with succinylacetone (Fig. 1f). By contrast, both endogenous
and exogenous haem contributed to the cytosolic and endoplasmic
reticulum labile haem pools (Fig. 1f). PGRMC2 depletion resulted in
decreased reporter activity in mitochondria, nuclei and, to a lesser
extent, the endoplasmic reticulum, indicating reduced presence of
labile haem (Fig. 1f, Extended Data Fig. 1g).

We next considered how
PGRMC2, which is localized in the endoplasmic reticulum and nuclear
envelope, might acquire haem. Notably, PGRMC1 forms a complex
with ferrochelatase (FECH) in the mitochondrial outer membrane that
controls haem release. PGRMC1 and PGRMC2 also interact, and
both are present in mitochondria-associated membranes. We noted
that in primary brown adipocytes PGRMC2 interacted with PGRMC1,
but not with glyceraldehyde-3-phosphate dehydrogenase, which was
recently designated a cytosolic haem chaperone (Fig. 1g). No interaction
between PGRMC2 and PGRMC1 was detected when an antibody
targeting the haem-binding domain was used, suggesting that PGRMC2
interacts with PGRMC1 at or near this region (Extended Data Fig. 1h).
Depletion of PGRMC1 resulted in reduced labile haem in all subcellular
compartments, probably reflecting its broader pattern of localization
(Fig. 1h). Dual knockdown of PGRMC1 and PGRMC2 had no added effect
(Fig. 1h, Extended Data Fig. 1g), suggesting that PGRMC2 acts
downstream of PGRMC1 to traffic endogenously synthesized haem. These
findings suggest a model in which mitochondria-bound PGRMC1 transfers
haem to endoplasmic-reticulum-bound PGRMC2, which delivers
haem to proteins in the endoplasmic reticulum and nucleus, including
haem-responsive transcription factors such as Rev-Erbα.

Fig. 1 | PGRMC2 controls the intracellular distribution of labile haem.
a, Purified PGRMC2 is similar in colour to haemoproteins. b, LC-MS/MS spectra
of PGRMC2 and haemin standard. c, Peroxidase activity of apo-HRP with
PGRMC2. PGRMC2, haemin, apo-HRP and apo-HRP plus protoporphyrin IX
(PPIX) show no activity. Haemin served as positive control. Technical duplicates
are shown. d, Native PAGE of wild-type and haem-binding mutant (3xM) PGRMC2
and apo-Rev-Erbα ligand-binding domain (LBD) alone or in combination stained
in-gel for haem (top) or protein (bottom). Black arrows, PGRMC2; red arrows,
Rev-Erbα. Haemin (20 μM) served as a positive control. PGRMC2 3xM and
apo-Rev-Erbα LBD show no haem staining. e, Differential spectroscopy of PGRMC2
haem-binding domain with increasing amounts of ferric or ferrous (in presence
of 10 mM dithionite) haemin. Titration curves represent differential absorbance
at 405 (ferric) and 400 (ferrous) nm. Kd is expressed as mean ± s.d. f, Peroxidase
activity in HEK293T cells co-transfected with labile haem reporters and
scrambled or Pgrmc2 siRNA, and exposed to succinylacetone (SA),
haem-depleted FBS or both for 48 h. ER, endoplasmic reticulum. g, Endogenous
PGRMC2 co-immunoprecipitates with endogenous PGRMC1 in primary brown
adipocytes. h, Peroxidase activity in HEK293T cells co-transfected with labile
haem reporters and scrambled, Pgrmc1, or Pgrmc1 and Pgrmc2 siRNA and treated
as in f. The scrambled group is repeated from f. In f, h, n = 6 biologically
independent samples. Representative results from two (b, g) or three (a, c-f, h)
independent experiments. Data are presented as mean ± s.d.; *P < 0.05 and
***P < 0.001 versus scrambled basal; determined by two-way analysis of variance
(ANOVA) with multiple comparisons and Tukey’s post-test.


PGRMC2 is required for thermogenesis 

To evaluate the importance of PGRMC2-mediated haem mobilization
in vivo, we focused on adipose tissue. PGRMC2 is enriched in adipose
depots, particularly brown adipose tissue (BAT) (Extended Data Fig. 2a, b).
We generated adipose-specific PGRMC2-null mice, which we designated
PGRMC2 adipose tissue knockout (PATKO), that lack PGRMC2
only in mature adipocytes. To avoid compensation mechanisms, unless
noted all procedures were conducted at thermoneutrality. PATKO mice
adapted to 30 °C showed no difference in body weight or white adipose
tissue (WAT) mass (Extended Data Fig. 2c) but had reduced BAT weight
(Fig. 2a) relative to their wild-type littermates. Notably, the appearance of
PGRMC2-deficient BAT was markedly altered, with loss of its distinctive
reddish colour (Fig. 2b). There was, however, no difference in expression
of brown adipocyte markers (Fig. 2c) and histological comparison
failed to reveal any difference (Fig. 2d). These findings led us to test
the functionality of PGRMC2-deficient BAT. Reflecting the minor role
of BAT at thermoneutrality, PATKO mice were indistinguishable from
wild-type mice in energy balance studies (Extended Data Fig. 2d). However,
in contrast to wild-type mice, which activated thermogenesis and
preserved body temperature when exposed to cold (4 °C), PATKO mice
rapidly became hypothermic and perished if not rescued (Fig. 2e, f).
This total impairment of adaptive thermogenesis was not a result of
reduced sympathetic stimulation, as the transcriptional response to
noradrenaline remained intact (Fig. 2g), despite a modest decrease
in plasma noradrenaline (Extended Data Fig. 2e). Plasma glucose during
challenge was similar to that of wild-type mice, and non-esterified
fatty acids were minimally reduced (Extended Data Fig. 2e). To confirm
that the thermogenic defect was independent of noradrenaline levels,
we used the β3-adrenergic receptor agonist CL316,243. Injection
of CL316,243 elicited an immediate and sustained increase in oxygen
consumption in wild-type mice; this response was significantly blunted
in PATKO mice (Extended Data Fig. 2f). Further, consistent with our
model of mitochondrial haem mobilization, adipose-specific PGRMC1
and PGRMC2 double-knockout mice were also cold-sensitive, perhaps
more so than PATKO mice (Extended Data Fig. 2g). These findings stress
the importance of the PGRMC1-PGRMC2 haem-trafficking pathway for
adaptive thermogenesis.

Fig. 2 | PATKO mice are sensitive to cold. a, b, Weight (a) and gross appearance
(b) of BAT of chow-fed wild-type (WT) (n = 8) and PATKO (n = 9) mice maintained at
30 °C. Scale bar, 1 cm. c, Expression of thermogenic genes is decreased in PATKO
BAT (wild type, n = 5; PATKO, n = 6). d, Haematoxylin and eosin (H&E) staining of
BAT. Representative images from two independent experiments (n = 5). Scale bar,
100 μm. e, PATKO mice are cold-intolerant (n = 12). Challenge started at Zeitgeber
time (ZT) 5. f, Survival curves at 4 °C (homeothermia is at 31 °C). g, PATKO BAT
responds normally to adrenergic signalling (wild type, n = 4; PATKO, n = 5). Pgc-1α
is also known as Ppargc1a. In a-g, n represents biologically independent samples.
Data are mean ± s.e.m. *P < 0.05, **P < 0.01 and ***P < 0.001; PATKO versus wild
type. #P < 0.001; 30 °C versus 4 °C, determined by two-tailed Student’s t-test (a,
c) or two-way ANOVA with multiple comparisons and Bonferroni's post-test (e, g).


Loss of PGRMC2 causes mitochondrial dysfunction 


To determine the basis of the defects of PGRMC2-null BAT, we measured 
total haem content and found it considerably reduced (about 60%) 
(Fig. 3a). To probe the origins of this difference, we quantified haem 
precursors and found reduced levels of succinyl-CoA and glycine, the 
substrates of 5’-aminolevulinate synthase 1 (ALAS1), the rate-limiting 
enzyme of haem biosynthesis (Extended Data Fig. 3a). Accordingly, 
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levels of 5-aminolevulinic acid, the product of ALAS1I, tended to 
decrease (Extended Data Fig. 3a). We also noted decreased expres- 
sion of Alas1 and Alas2 (Extended Data Fig. 3b), indicating that defects 
in biosynthesis contribute to decreased total haemin PATKO BAT. Iron 
content was the sameas in wild-type BAT, indicating that the reduced 
haem levels were not caused by iron deficiency (Fig. 3b) and suggest- 
ing that tissue haem uptake was unaffected. Of note, labile haem levels 
were significantly decreased in nuclei purified from PATKO (Fig. 3c) 
and PGRMC1 and PGRMC2 double-knockout BAT, which had a similar 
discoloured appearance to PATKO BAT (Extended Data Fig. 3c). Inthe 
nucleus, haem regulates the activity of several transcription factors 
that, upon binding haem, are ultimately degraded. These include
Rev-Erbα and the transcriptional repressor BACH1. Levels of Rev-Erbα
and BACH1 proteins were higher in PATKO BAT (Fig. 3d), indicating
that reduced nuclear labile haem resulted in stabilization of these
factors. Accordingly, expression of Bmal1 (also known as Arntl) and Fth1,
targets of Rev-Erbα and BACH1, respectively, was reduced (Extended
Data Fig. 3d). The circadian pattern of Rev-Erbα mRNA expression
was not altered in PATKO mice (Extended Data Fig. 3e), suggesting that
the increased Rev-Erbα protein levels in PATKO BAT are probably the
result of reduced degradation. RNA-sequencing (RNA-seq) analysis
showed that among differentially expressed genes (DEGs) between 
wild-type and PATKO BAT (adjusted P < 0.05; 312 DEGs upregulated and
236 DEGs downregulated) (Supplementary Table 1), haem and iron
homeostasis genes were enriched (45 genes, 8.2% of DEGs versus 3.9%
in the BAT transcriptome; P< 10”) (Fig. 3e). Enhancer analysis of
downregulated DEGs in PATKO BAT revealed an enrichment (P<10) of
Rev-Erbα and BACH1 and BACH2 motifs (Fig. 3f, Supplementary Table 2),
consistent with altered regulation of haem-sensitive transcription. The
majority of haem and iron-linked DEGs were present in the three
most-downregulated pathways, which relate to metabolic processes and
energy generation and contain many mitochondrial proteins. Expression
of electron transport chain and tricarboxylic acid cycle genes
(Extended Data Fig. 3f, g) was broadly decreased in PATKO BAT, and 
levels of all electron transport chain proteins analysed were notably 
lower (Fig. 3g, h). Further, PATKO BAT had substantially reduced levels
of uncoupling protein 1 (UCP1) (Fig. 3h), a finding consistent with
greater stability of Rev-Erbα, which directly represses Ucp1. Beyond
its role in uncoupling mitochondrial electron transport, UCP1 regulates
mitochondrial integrity. PGRMC2-null brown adipocytes have
large, swollen mitochondria with few, disorganized cristae (Fig. 3i), 
indicating mitochondrial dysfunction. Indeed, mitochondria isolated 
from PATKO BAT had reduced basal and markedly reduced uncoupled 
respiration (Fig. 3j). These findings demonstrate that in the absence of 
PGRMC2 there is a lower level of labile haem in the nucleus, leading to 
changes in the haem-responsive transcriptome that cause mitochon- 
drial dysfunction. 
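
As a quick arithmetic check on the enrichment figures quoted above, the following Python sketch recomputes the share of haem and iron homeostasis genes among DEGs from the reported counts and compares it with the stated transcriptome background. It is an illustration only: it does not reproduce the statistical test behind the reported P value, which the text does not name, and the variable names are introduced here for convenience.

```python
# Counts quoted in the text above.
deg_up = 312
deg_down = 236
haem_iron_degs = 45

total_degs = deg_up + deg_down              # all DEGs, up plus down
deg_fraction = haem_iron_degs / total_degs  # share of haem/iron genes among DEGs

# Background frequency of haem and iron genes in the BAT transcriptome,
# taken directly from the text (3.9%).
background_fraction = 0.039

# Ratio of the two frequencies; this is a descriptive figure, not a test statistic.
fold_enrichment = deg_fraction / background_fraction

print(f"total DEGs: {total_degs}")
print(f"haem/iron share of DEGs: {deg_fraction:.1%}")
print(f"fold enrichment over background: {fold_enrichment:.1f}")
```

The recomputed share agrees with the 8.2% given in the text and corresponds to roughly a twofold over-representation relative to the transcriptome background.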


Endogenous haem controls mitochondrial function 


Primary brown PATKO adipocytes recapitulated these defects: they 
exhibited severely reduced respiratory capacity, a markedly blunted 
response to adrenergic stimuli without alterations in the transcriptional 
response to noradrenaline, and decreased levels of UCP1 and electron
transport chain proteins (Extended Data Fig. 4a–j). Similar, and perhaps
greater, defects were noted in adipocytes deficient in both PGRMC1 and
PGRMC2 (Extended Data Fig. 4k). The introduction of human PGRMC2
into mouse PGRMC2-null brown adipocytes restored mitochondrial
bioenergetics and UCP1 levels, whereas expression of a PGRMC2
haem-binding mutant did not, indicating that these defects are related to
the ability of PGRMC2 to mobilize haem (Extended Data Fig. 4l–o).
Notably, mirroring the effect of PGRMC2 deletion, inhibition of haem
synthesis was sufficient to impair mitochondrial function and deplete
UCP1 in wild-type cells (Extended Data Fig. 5a–d). Neither depletion
of exogenous haem nor addition of haemin affected mitochondrial
respiration in wild-type or PATKO cells (Extended Data Fig. 5a–e).
These observations show that PGRMC2-dependent mobilization
of endogenous haem regulates mitochondrial function in brown
adipocytes. Lastly, we found that both Rev-Erbα and BACH1 proteins
were more abundant in PATKO cells (Extended Data Fig. 5f). Dual
knockdown of these factors restored basal respiration in PGRMC2-null
adipocytes (Extended Data Fig. 5g, h), indicating that they are
key mediators of the transcriptional response to haem and its effect
on mitochondrial function.


Fig. 3 | PGRMC2 regulates haem-sensitive transcription and mitochondrial
function in BAT. a, b, Total haem (a; wild type, n = 6; PATKO, n = 8) and iron
(b; n = 8) levels in BAT. c, Labile haem in mitochondrial and nuclear fractions of
BAT (wild type, n = 5; PATKO, n = 8). d, Rev-Erbα and BACH1 levels in BAT.
e, Genes related to haem and iron metabolism (red portions) are enriched in
DEGs. f, Rev-Erbα-, BACH1- and BACH2-binding motifs are enriched in genes
downregulated in PATKO BAT. g, Heat map of haem- and iron-related genes shows
a global decrease of electron transport chain (ETC) and tricarboxylic acid
cycle (TCA) gene expression. h, UCP1 and oxidative phosphorylation (OXPHOS)
proteins are reduced in PATKO BAT. i, Electron microscopy shows altered
mitochondrial morphology in PATKO BAT. Representative images from four
biologically independent samples. j, Oxygen consumption rate (OCR) of
mitochondria isolated from BAT (n = 6). In a–j, n represents biologically
independent samples. Representative results from two (a–c, j) or three (d, h)
independent experiments. Data are mean ± s.e.m. *P < 0.05, **P < 0.01 and
***P < 0.001 versus wild type; by two-tailed Student’s t-test.


Adipose PGRMC2 regulates systemic metabolism 


We next gauged the importance of adipose PGRMC2 for glucose
homeostasis. PATKO mice housed at room temperature and fed a high-fat diet
(HFD) showed no differences in body weight or composition, except
for decreased BAT mass (Extended Data Fig. 6a, b). However, they had
higher fasting glycaemia (Fig. 4a) and decreased glucose tolerance and
insulin sensitivity (Fig. 4b, c). They also exhibited hyperlipidaemia and
exacerbated liver steatosis (about 70% more triglycerides) (Extended
Data Fig. 6c–e), factors that probably increased insulin resistance. The
BAT of HFD-fed PATKO mice showed no histological abnormalities
(Extended Data Fig. 7a) but had substantially reduced Ucp1 expression
(approximately 40% less) (Fig. 4d). Expression of Bmal1 and Fth1 was also
decreased (Extended Data Fig. 7b). Analysis of WAT depots did not show
extensive differences in adipocyte size, immune cell infiltration or gene
expression in inguinal or epididymal WAT (Extended Data Fig. 7c–e).
Notably, Bmal1 expression was reduced in PATKO inguinal WAT (Extended Data
Fig. 7e). We propose that hastened metabolic deterioration in HFD-fed
PATKO mice probably reflects the aggregate of defects in BAT and WAT.


Fig. 4 | PGRMC2 controls systemic glucose homeostasis. a, Blood glucose
(wild type, n = 9; PATKO, n = 10) and insulin (n = 6) in wild-type and PATKO mice
on HFD. b, c, Glucose tolerance test (GTT) (b) and insulin tolerance test (ITT) (c)
after 10 (GTT) and 12 (ITT) weeks of HFD (GTT: wild type, n = 8, PATKO, n = 11;
ITT: wild type, n = 6, PATKO, n = 9). d, Ucp1 mRNA in BAT of HFD-fed wild-type
(n = 9) and PATKO (n = 10) mice. e, Glucose and insulin levels in DIO mice treated
with vehicle or CPAG-1 for 30 days (n = 7). f, g, GTT (f) and ITT (g) in DIO mice
after 14 (GTT) and 20 (ITT) days of treatment (n = 7). h, H&E staining of BAT.
Representative images from four biologically independent samples. i, Ucp1
mRNA levels in BAT of treated DIO mice (n = 7). j, UCP1 and Rev-Erbα levels in
BAT of treated DIO mice. k, Nuclear labile haem levels in BAT of DIO mice
treated with CPAG-1 for four days (n = 4). In a–k, n represents biologically
independent samples, representative results from two independent
experiments. Data are mean ± s.e.m. *P < 0.05, **P < 0.01 and ***P < 0.001 versus
wild type or vehicle; two-tailed Student’s t-test (a, d, e, i, k) or two-way ANOVA
with multiple comparisons and Bonferroni’s post-test (b, c, f, g).


PGRMC2 activation mitigates metabolic disease 


The deleterious effects on metabolism of adipose PGRMC2 deletion 
suggest that activation of PGRMC2 function might reverse features 
of metabolic syndrome. Thus, we treated diet-induced-obese (DIO) 
mice at room temperature with a small-molecule PGRMC2 activator 
(compound 27 in ref. >; hereafter referred to as CPAG-1). CPAG-1 treat- 
ment had no effect on weight or food intake (Extended Data Fig. 8a), but 
treated mice had reduced fasting glycaemia and insulin levels (Fig. 4e) 
and improved glucose tolerance and insulin sensitivity (Fig. 4f, g). BAT 
histology showed decreased lipid content and an increase in multi- 
locular adipocytes (Fig. 4h), features indicative of improved function. 
Expression of Ucp1 and Bmal1 was also upregulated (Fig. 4i, Extended
Data Fig. 8b), changes suggestive of reduced levels of Rev-Erbα. Indeed,
Rev-Erbα protein was decreased and UCP1 protein was increased in BAT
of CPAG-1-treated mice (Fig. 4j). Labile haem in the nucleus of brown
adipocytes from CPAG-1-treated mice was significantly increased within
four days of treatment (Fig. 4k), suggesting that decreased Rev-Erbα
protein was probably the result of haem-induced degradation. No
histological differences were found in inguinal WAT (Extended Data
Fig. 8c), but expression of Ucp1 and Pgc-1α was increased (Extended
Data Fig. 8d). Histology revealed a marked improvement in epididymal 
WAT, with fibrosis and inflammation noticeably decreased (Extended 
Data Fig. 8e, f). The liver of CPAG-1-treated mice appeared slightly less 
steatotic and expression of gluconeogenic genes and Tnfa was reduced 
(Extended Data Fig. 8g, h). CPAG-1 treatment also increased hepatic 
nuclear labile haem levels (Extended Data Fig. 8i). Given that CPAG-1 
interacts very weakly with PGRMC1 (Extended Data Fig. 9), we suggest it
may act primarily through PGRMC2 to increase haem flux to the nucleus.


Discussion 


In this study we have described a role for PGRMC2 in transport of
mitochondrial haem. In the absence of PGRMC2, less labile haem reaches the
nucleus, resulting in alterations in haem-sensitive transcription that 
cause mitochondrial dysfunction in brown adipocytes (Extended Data 
Fig. 10). These defects compromise not only the primary function of 
BAT (preservation of normal body temperature), but also its contribu- 
tion to systemic glucose homeostasis. Given its high expression across 
white fat depots, further studies will be needed to determine whether 
PGRMC2 performs a similar role in WAT. Nevertheless, our findings pro- 
vide a view of how haem dynamics in adipocytes can affect physiology. 
Haem levels and expression of biosynthetic enzymes are reduced in
visceral fat of obese humans, stressing the link between adipocyte haem
homeostasis and metabolic disease. Because PGRMC2 is restricted in its
tissue distribution, additional haem chaperones probably remain to be
discovered. Finally, we have shown that pharmacological activation of
PGRMC2 may be of use in treating metabolic disease. Given the interest
in identifying signalling pathways that enhance adipocyte function and
correct obesity-linked adipose tissue defects, our findings suggest
that modulation of intracellular haem dynamics could be a potentially
innovative therapeutic strategy.
Methods


No statistical methods were used to predetermine sample size. The 
experiments were not randomized. The investigators were not blinded 
to allocation during experiments and outcome assessment. 


Reagents 

Haemin, protoporphyrin IX, oligomycin A, carbonyl cyanide 4-(tri- 
fluoromethoxy) phenylhydrazone (FCCP), rotenone, antimycin A, 
3-isobutyl-1-methylxanthine (IBMX), BSA, mannitol, noradrenaline, 
isoproterenol, 8-Br-cAMP and succinylacetone were purchased from 
Sigma-Aldrich. CL 316,243 was obtained from Cayman Chemical. For- 
skolin was obtained from Chem Impex International. Insulin (Novo- 
lin) was purchased from Novo-Nordisk. Complete EDTA-free protease 
inhibitor cocktail was obtained from Roche. DMEM and other Gibco- 
branded cell culture products were purchased from Thermo Fisher. 
Haem-depleted FBS was prepared by treating FBS with 20 mM ascorbic 
acid for 16h, followed by 24 h dialysis against PBS. Haem depletion was 
verified by measuring optical absorbance at 405 nm. CPAG-1 was
synthesized as previously reported. ON-TARGET siRNA SMARTpools against
human PGRMC2 (L-010639-00-0005), PGRMC1 (L-010642-00-0005),
and mouse Nr1d1 (L-051721-00-0005), and Bach1 (L-042956-01-0005),
as well as a Non-targeting Pool (D-001810-10-05) were purchased from 
Dharmacon. HEK293T cells were obtained from ATCC (CRL-3216) hav- 
ing undergone short-tandem repeat verification. Cells were routinely 
tested for mycoplasma and were never positive. 


Protein production 

Full-length PGRMC2 in a bacterial expression vector (GenBank Accession
number NM_027558; Genecopoeia Ex-Mm25103-B01) was transformed
into chemically competent BL21(DE3) cells (Thermo Fisher)
and grown at 37 °C to an OD600 of 0.8. Cells were induced with 1 mM
IPTG and grown for 12 h at 30 °C. Cells were collected and stirred at
room temperature in 50 mM Tris-HCl, 150 mM NaCl pH 8.5 containing
1% Triton X-100, 100 µg/ml lysozyme, 100 µg/ml DNase I, 10 mM
MgCl2, 10 mM CaCl2 and 1× Complete EDTA-free protease inhibitor
cocktail (Roche) for 1 h. After sonication, the lysate was centrifuged at
6,000g for 30 min and the supernatant purified using nickel affinity
chromatography. After elution, the protein was dialysed into 50 mM
Tris-HCl, 150 mM NaCl, pH 7.4, and purified by HiLoad 16/600 Superdex
75 size exclusion chromatography (GE Healthcare). A mouse PGRMC2
haem-binding mutant was created by mutating 3 amino acids (Y131F, 
K187A and Y188F) using a Quikchange II XL Site-Directed Mutagenesis 
Kit (Agilent), verified by DNA sequencing, and expressed and purified 
as described above. To generate the PGRMC2 cytochrome b5 haem- 
binding domain, residues 102-209 of human PGRMC2 were codon- 
optimized, synthesized (Integrated DNA Technologies) and inserted 
into the pET21a vector. The plasmid construct was transformed into 
BL21(DE3) cells (Thermo Fisher). Cells were grown at 37 °C to an OD600
of 1.0 and induced with 1 mM IPTG for 5 h, collected and resuspended
in 50 mM Tris-HCl, 1 mM EDTA, 0.01% NaN3, 1 mM DTT, 25% sucrose,
and lysed in 50 mM Tris-HCl, 1 mM EDTA, 0.01% NaN3, 1 mM DTT, 200
mM sodium chloride, 1% sodium deoxycholate and 1% Triton X-100. To
isolate inclusion bodies, lysed cells were centrifuged at 6,000g for 20
min and washed extensively with 50 mM Tris-HCl, 1 mM EDTA, 0.01%
NaN3, 1 mM DTT, 25% sucrose, 100 mM sodium chloride and 0.5% Triton
X-100. Inclusion bodies were subjected to a final wash in the same buffer
without Triton X-100. To denature inclusion bodies, about 200 mg of
inclusion bodies was resuspended in 100 mM Tris-HCl, 6 M guanidinium
chloride and 20 mM β-mercaptoethanol for 1 h at room temperature
(RT). Denatured inclusion bodies were refolded overnight at 4 °C in an
oxidative refolding buffer containing 400 mM L-arginine, 100 mM
Tris-HCl, 5 mM reduced glutathione, 0.5 mM oxidized glutathione, 10 mM
EDTA and 200 mM phenylmethylsulphonyl fluoride. Refolded protein
was concentrated and purified using HiLoad 16/600 Superdex 75 size
exclusion chromatography (GE Healthcare). Purity of PGRMC2 proteins
was confirmed using SDS-PAGE. Soret and α, β absorption spectra were
measured on a SpectraMAX 250 reader (Molecular Devices) at room
temperature. Purified PGRMC2 protein was incubated for 15 min with
10 mM dithionite to reduce the haem group. Human REV-ERBα LBD
(residues 281-614) with an N-terminal hexahistidine tag and a tobacco
etch virus (TEV) protease cleavage site was inserted into a pET46 vector
and expressed in E. coli BL21(DE3) cells. Cells were grown at 37 °C
overnight and induced in autoinduction medium at 37 °C for 5 h, 30 °C for
1 h, and 22 °C for 16 h. Cells were collected and pellets stored at −80 °C.
Pellets were thawed on ice and resuspended in lysis buffer without
imidazole (40 mM NaHPO4, pH 7.4, 500 mM NaCl, 10% glycerol, 2.5 mM
DTT and 0.1% Tween-20) at 40 ml buffer per 5 g pellet. The cell slurry
was sonicated on ice in 15 s on/30 s off intervals (75% amplitude) for 5
min total. Lysed cells were centrifuged at 14,000 rpm for 30 min at 4 °C.
The supernatant was filtered through a 0.4-µm PES membrane Nalgene
Rapid-Flow bottle-top filter and affinity purified using 2 × 5 ml HisTrap
IMAC columns (GE Healthcare) affixed to an Akta Start. After loading,
columns were washed with 100 ml wash buffer (40 mM NaHPO4, pH
7.4, 500 mM NaCl, 10% glycerol, 15 mM imidazole and 1 mM DTT). The
protein was eluted using a 10 column-volume elution gradient with
elution buffer (40 mM NaHPO4, pH 7.4, 500 mM NaCl, 10% glycerol,
500 mM imidazole and 1 mM DTT). The protein was eluted after >50%
elution buffer then pooled and dialysed in 10-kDa MWCO SnakeSkin
dialysis tubing (Thermo Fisher) for 24 h at 4 °C to remove imidazole
and bound haem in 2 l dialysis buffer (40 mM NaHPO4, pH 7.4, 500 mM
NaCl, 10% glycerol, 10 mM DTT, 0.1% Tween-20 and 0.5 mM EDTA).
After dialysis, the protein was concentrated using a 30-kDa MWCO
Amicon Ultra centrifugal concentrator (EMD Millipore). The protein
was further purified by size exclusion chromatography (Akta Pure)
using a Superdex 75 10/300 GL column in gel filtration buffer (20 mM
NaHPO4, pH 7.4, 50 mM NaCl, 50 mM L-arginine, 50 mM L-glutamate and
0.5 mM EDTA). The protein was pooled and confirmed to be >90% pure
by LC-MS and SDS-PAGE. The fraction of final purified 6xHis-REV-ERBα
LBD bound to haem was assessed using the extinction coefficient
for the haem Soret peak (€,,;) of 101.85 = 1 mM, and 6xHis-REV-ERBa 
LBD was confirmed to be >95% haem-free. 


Haem titration assay 

The affinity of PGRMC2 cytochrome b5 haem-binding domain for ferric
and ferrous haem was measured by spectroscopy of the UV-visible
spectrum in the Soret region using a SpectraMAX 250 reader. Sequential
aliquots of haemin in DMSO were added to the sample well containing
10 µM apo-PGRMC2 and the reference well to obtain a 2-µM increment
of haemin concentration per addition. Spectra were recorded 3 min
after each addition of haemin. The difference in absorbance at 420
nm was plotted in relation to haemin concentration, and dissociation
constants (Kd) calculated with GraphPad Prism 6 using a quadratic
binding equation.
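
The text specifies only that a quadratic binding equation was fitted in GraphPad Prism 6. As a minimal sketch, the Python below assumes the standard 1:1 binding model with ligand depletion, in which the complex concentration is the smaller root of the mass-action quadratic. The function names, the example Kd and the maximal absorbance change are hypothetical and chosen purely to illustrate the shape of the titration curve; they are not values from this study, and the code does not reproduce the Prism fit.

```python
import math


def bound_complex(protein_total, ligand_total, kd):
    """Complex concentration for 1:1 binding with ligand depletion.

    Solves c**2 - (P + L + Kd) * c + P * L = 0 for the physically
    meaningful (smaller) root. All arguments share one unit (µM here).
    """
    s = protein_total + ligand_total + kd
    return (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0


def delta_absorbance(ligand_total, kd, delta_a_max, protein_total=10.0):
    """Predicted absorbance change, scaled by the fraction of protein bound."""
    return delta_a_max * bound_complex(protein_total, ligand_total, kd) / protein_total


# Titration points: 2 µM haemin increments into 10 µM apo-protein, as in the text.
haemin_um = [2.0 * i for i in range(16)]

# Hypothetical parameters, for illustration only.
example_kd_um = 1.0
example_delta_a_max = 0.5

curve = [delta_absorbance(h, example_kd_um, example_delta_a_max) for h in haemin_um]

for h, a in zip(haemin_um, curve):
    print(f"{h:5.1f} µM haemin -> predicted dA420 = {a:.3f}")
```

In a real analysis, Kd and the maximal absorbance change would be estimated by least-squares fitting of this function to the measured difference spectra. The quadratic form matters here because the protein concentration (10 µM) is not negligible relative to the haemin added, so free and total ligand cannot be treated as equal.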


Haem transfer assay 

Twenty-five microlitres of 200 nM apo-HRP (Calzyme Laboratories) 
was incubated with 25 µl of 5 µM purified PGRMC2 protein. After 5
min at room temperature, 150 µl of BioFX TMB One Component HRP
microwell Substrate (Surmodics) was added to wells and absorbance 
at 405 nm measured immediately for 15 min. As a positive control, 
apo-HRP was incubated with 0.3 nM haemin for 5 min and absorbance 
measured as described. 


Native PAGE and in-gel haem staining 

Haem transfer was assessed by mixing 10 µg of wild-type or mouse
PGRMC2 haem-binding mutant (3xM) with 10 µg of apo-REV-ERBα
protein and incubating for 30 min at 37 °C. After incubation, 2x Native 
Tris-Glycine sample buffer (Life Technologies) was added and samples 
separated by electrophoresis using Novex Tris-Glycine 4-20% gels and 
Tris-Glycine Native Running Buffer (Life Technologies) for 6 h. The gel
was washed for 10 min with water and haem staining was performed
using the BioFX TMB One Component HRP Microwell Substrate
(Surmodics). After imaging the haem stain, the gel was washed overnight
with water and counterstained with Coomassie for protein detection. 


Mass spectrometry 

To detect haem in purified PGRMC2 protein, 5 µl of 20 mg/ml PGRMC2
was extracted with 1 ml of Folch solution (2:1 chloroform:methanol)
and washed with 200 µl of water. The extraction solution was then
vortexed and centrifuged at 1,000g, 4 °C for 10 min and the lower phase
extracted and dried down. Before LC-MS analysis, the sample was
reconstituted in methanol. A haemin standard solution was prepared
at 10 µM. LC-MS analysis was performed on an I-class UPLC system
coupled with a Synapt G2-Si mass spectrometer via an electrospray 
ionization (ESI) source from Waters. The positive-mode (+) ESI condi- 
tions were as follows: capillary, +3.00 kV; sampling cone, 40 V; source 
temperature, 100 °C; desolvation temperature, 250 °C; desolvation 
gas flow, 600 l/h; and cone gas flow, 50 l/h. Leucine-enkephalin
(m/z 556.2771) was used for lock mass correction. Liquid
chromatography was performed with A = 40:60 water:acetonitrile +
1 mM ammonium formate, B = 90:10 2-propanol:acetonitrile. A Waters
ACQUITY UPLC BEH C18 column (1.7 µm, 2.1 mm × 100 mm) was used
at a flow rate of 250 µl/min. Initially, the mobile phase consisted of
32% B, held for 1 min after injection; the proportion of B was then
increased over the length of the gradient (15 min, B = 97%) in short
increments adapted from a previous study. The injection volume was
2 µl. For haem quantification in tissue, BAT was isolated from wild-type
and PATKO mice housed at 30 °C after 10 min of perfusion with cold
PBS. Ten to twenty-five milligrams of frozen tissue was homogenized
in 300 µl of 1% formic acid in dH2O and an internal standard added.
Haem was extracted in Folch solution (2:1 chloroform:methanol). After 
centrifugation at 4,000g for 10 min at 4 °C, haem was re-extracted from 
the organic phase with 1 volume of 1.4 N NaOH. Samples were centri- 
fuged at 4,000g for 10 min at 4 °C and the aqueous phase collected for 
mass spectrometry analysis. Haemin was quantified on an Agilent 6495 
triple quadrupole with a jet stream source coupled to an Agilent 1290 
UPLC. As internal standard, deuteroporphyrin (Frontier Scientific) was 
used and the monitored transitions were m/z 616.1 > 557.1 (quantitative)
and m/z 616.1 > 498.2 (qualitative) for haemin, and m/z 564.0 > 505.0
for deuteroporphyrin. Jet stream was set at gas temperature 200 °C,
gas flow 12 l/min, nebulizer pressure 30 psi, sheath gas temperature
325 °C, sheath gas flow 10 l/min, cap V = 400 V, nozzle V = 2,000 V. Liquid
chromatography was performed with A = 90:10 water:methanol + 0.1%
ammonium hydroxide and + 10 mM ammonium formate, B = 65:30:10
2-propanol:methanol:water + 0.1% ammonium hydroxide and + 10 mM
ammonium formate. All solvents were LC-MS grade. An Agilent
Extend-C18 column (1.8 µm, 2.1 × 50 mm) was used at a flow rate of 0.2 ml/min.
Initially, the mobile phase consisted of 5% B and, after injection, its
composition increased linearly to 95% B in 6 min and held at 95% for
3 min. The injection volume was 5 µl. Haem content was normalized
per milligram of tissue. Quantitative analysis of glycine, aminolevulinic
acid and succinyl-CoA was performed using a QQQ mass spectrometer
operated in positive-ion mode (Xevo TQ-XS from Waters). In brief, 
10 mg of frozen BAT was homogenized with ice cold 80% methanol 
and glass beads and incubated on ice for additional 10 min. The tissue 
lysate was centrifuged at 18,000g for 10 min at 4 °C and split into two 
aliquots followed by drying down in a vacuum concentrator and stored
at −80 °C before LC-MS/MS analysis. For glycine and aminolevulinic
acid analysis, an aliquot was reconstituted in 1:1 acetonitrile:water
and injected into a Waters ACQUITY UPLC BEH Amide column (1.7 µm,
2.1 mm × 100 mm) at a flow rate of 400 µl/min. The mobile phases
consisted of A = water + 0.1% formic acid and B = acetonitrile + 0.1% formic
acid. Initially, the mobile phase composition consisted of 95% B and
held for 1 min after injection and its composition was decreased to 65%
over 6 min and then to 40% over 3 min and held for an additional 1 min.
The following quantifier and qualifier transitions (collision energy
in eV) were used for each metabolite: glycine: 76.0 > 30.3 (6 eV) and
48.2 (4 eV); 13C-glycine: 78.0 > 31.0 (6 eV) and 49.0 (4 eV); aminolevulinic
acid: 132.2 > 55.1 (18 eV), 68.3 (18 eV), 86.0 (10 eV), 114.0 (6 eV). For
succinyl-CoA, an aliquot was reconstituted in 50 mM ammonium acetate
(pH 6.8 adjusted with ammonium hydroxide) and analysed as soon as
possible once samples had been reconstituted to avoid degradation.
Liquid chromatography was performed with A = 50 mM ammonium
acetate (pH 6.8) and B = 80% methanol. A Waters ACQUITY UPLC BEH
C18 column (1.7 µm, 2.1 mm × 100 mm) was used at a flow rate of 250
µl/min. Initially, the mobile phase composition consisted of 2% B and
held for 1.5 min after injection and its composition was increased to 
15% over 1.5 min and then to 95% over 1.5 min and held for 9 min. The 
following quantifier and qualifier transitions (collision energy in eV) 
were used for succinyl-CoA: 868.1 > 99.0 (54 eV), 136.3 (54 eV), 259.1 
(54 eV) and 361.3 (54 eV). 
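
The linear gradient described above for haemin quantification on the triple quadrupole (5% B at injection, rising linearly to 95% B in 6 min, then held at 95% for 3 min) can be written as a simple piecewise function. The Python below is a sketch for checking or tabulating that programme; the function name and its parameters are hypothetical and do not correspond to any instrument control software.

```python
def percent_b(t_min, start=5.0, end=95.0, ramp_min=6.0, hold_min=3.0):
    """Return %B at t_min minutes after injection: linear ramp, then hold."""
    if t_min < 0 or t_min > ramp_min + hold_min:
        raise ValueError("time outside the programmed gradient")
    if t_min <= ramp_min:
        # Linear interpolation between the starting and final composition.
        return start + (end - start) * t_min / ramp_min
    # Hold phase at the final composition.
    return end


# Tabulate the composition at a few time points across the 9 min run.
for t in (0, 3, 6, 9):
    print(f"t = {t} min: {percent_b(t):.1f}% B")
```

The same function can describe the other step gradients in this section by changing the defaults, although the multi-segment programmes (for example 95% to 65% to 40% B) would need one call per segment.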


Labile haem reporters targeted to subcellular compartments

HEK293T cells grown in DMEM with 10% FBS were transiently
transfected in OptiMEM for 8 h using Dharmafect Duo transfection reagent
(Dharmafect) in 96-well plate format. Peroxidase reporters (pEGFP-mitoAPX,
pEGFP-APX, pEGFP-NLS-APX, and pmCherry-ER-HRP) were
co-transfected with 50 nM siRNA against Pgrmc2, Pgrmc1, the combination
or a scramble control. After transfection, cells were switched to
basal medium (DMEM with 10% FBS), basal medium plus 0.5 mM
succinylacetone, haem-depleted medium (DMEM with 10% haem-depleted
FBS) or haem-depleted medium plus 0.5 mM succinylacetone. Cells
were lysed 72 h later in 100 µl haem lysis buffer (150 mM NaCl, 20 mM
HEPES, 0.5% Triton X-100, with Protease Inhibitor Cocktail Set III). Fifty 
microlitres of lysate was incubated with the BioFX TMB One Compo- 
nent HRP microwell Substrate (Surmodics). Absorbance at 620 nm 
was measured after 5 min for the ER-HRP reporter, and after 30 min 
for mitochondrial, nuclear, and cytosolic APX reporters. 


Co-immunoprecipitation 

Endogenous PGRMC2 and PGRMC1 were immunoprecipitated from 
primary brown adipocytes differentiated in vitro using anti-PGRMC2 
and anti-PGRMC1 antibodies. Cells were lysed in IP lysis buffer
(150 mM NaCl, 20 mM Tris-HCl, 10% glycerol, 1% Triton X-100 and
complete EDTA-free protease inhibitor cocktail) and protein quantified
using the DC assay (Biorad). One milligram of total proteome was
incubated with 4 µg of anti-PGRMC2, anti-PGRMC1 or rabbit IgG control
antibody pre-bound to 0.75 mg of Dynabeads Protein G (Thermo
Fisher). After overnight incubation at 4 °C, beads-antibody-protein
complexes were washed three times with PBS–0.02% Tween 20 for 5
min at RT, eluted in 50 mM glycine buffer pH = 2.8 for 10 min at 60 °C
and separated by SDS-PAGE for immunodetection. 


Western blot analysis 

Samples separated by SDS-PAGE were transferred onto nitrocel- 
lulose membranes. Membranes were incubated in blocking buffer 
(TBS-Tween 0.1%, BSA 5% w/v) for 1 h at room temperature. Membranes
were incubated overnight at 4 °C with primary antibodies diluted in
blocking buffer, washed three times for 15 min with TBS-Tween 0.1%,
and incubated for 1 h at room temperature with HRP-conjugated
secondary antibodies diluted in blocking buffer (1:20,000 dilution). The
antibodies and dilutions used in this work were: PGRMC2 (1:1,000,
Bethyl Laboratories, A302-954A and A302-955A), PGRMC1 (1:1,000,
Bethyl Laboratories, A304-561A), PPARγ, REV-ERBα (1:200, Santa Cruz
Biotechnology, sc-7273 and sc-100910), BACH1 (1:500, R&D Systems,
AF5777), UCP1 and OXPHOS (1:5,000 and 1:300, Thermo Fisher
Scientific, PA124894 and 458099), GAPDH, TUBULIN, and HSP90 (1:5,000,
GeneTex, GTX627408, GTX27291, and GTX101423), and CEBPδ (1:1,000,
Abgent, AP20492c).


Primary adipocyte culture 

Primary brown adipocytes were isolated from the interscapular BAT 
depot of wild-type and PATKO newborn mice. BAT depots were minced 
and digested by shaking for 40 min at 37 °C in isolation buffer
containing 61.5 mM NaCl, 2.5 mM KCl, 0.65 mM CaCl2, 2.5 mM glucose, 50 mM
HEPES, 50 U/ml, 50 µg/ml Pen/Strep, BSA 2% (w/v) and 1.5 mg/ml
collagenase type I (Worthington). Cells were filtered through a 70-µm
strainer and plated in DMEM with 25 mM glucose, 20 mM HEPES, 20%
FBS and Pen/Strep. Differentiation was induced when cells reached
confluence by switching the medium to DMEM, 10% FBS, 20 nM
insulin, 1 nM triiodothyronine (T3), 0.5 mM 3-isobutyl-1-methylxanthine
(IBMX) and 2 µg/ml dexamethasone (Dex). Two days later, medium
was replaced with DMEM, 10% FBS, 20 nM insulin and 1 nM T3. On day
4 of differentiation, cells were treated with 0.5 mM succinylacetone, 
or switched to haem-depleted FBS, for bioenergetics and gene/pro- 
tein expression studies and analysed at day 7. Exogenous haemin at 
a final concentration of 20 µM was added 48 h before bioenergetics
studies were performed. On day 7 of differentiation, adipocytes were 
treated with vehicle or 100 nM noradrenaline for 2 h for gene expres- 
sion studies. For complementation experiments, cells were infected 
with lentiviruses expressing mCherry, wild-type human PGRMC2 or a
human PGRMC2 haem-binding mutant (3xM; Y137F, K193A and Y194F)
at day 0 of differentiation in the presence of 5 µg/ml polybrene.
Rev-Erbα and BACH1 knockdown in mature adipocytes was performed as
previously described.


Mitochondrial bioenergetics measurements 

The oxygen consumption rate of adipocytes was measured on a
Seahorse XFe96 instrument. Primary brown adipocytes differentiated
in vitro were re-plated at day 5 of differentiation on gelatin-coated 
XFe96 plates at a density of 8,000 cells per well. Two days after plating, 
cells were equilibrated in serum-free DMEM (Sigma-Aldrich D5030) 
containing 25 mM glucose, 10 mM sodium pyruvate, 2 mM glutamine 
and 5 mM HEPES pH 7.4 for 1h before a mitochondrial stress test was 
performed at day 7 consisting of 3 min cycles of mixing and 2 min cycles
of measurements. Basal respiration rates were measured, followed by
sequential injections of oligomycin (2 µM), FCCP (1 µM) and rotenone
(2 µM) plus antimycin A (RAA, 2 µM). To measure the acute response
to adrenergic signalling stimulators, compounds were injected using
one of the ports after measurements of basal respiratory rates were
complete. Freshly isolated BAT mitochondria (4 µg per well) were
transferred onto XFe96 plates containing isolation buffer 2 (IB2 = 220 mM
mannitol, 70 mM sucrose, 10 mM KH2PO4, 5 mM MgCl2, 1 mM EGTA,
0.5 mM ADP, 2 µM rotenone, 10 mM succinate, 0.2% BSA and 2 mM
HEPES pH 7.4), and plates centrifuged at 2,000g for 20 min at 4 °C.
Oxygen consumption rate was measured after sequential injections
at final concentrations of 4 µM oligomycin, 4 µM FCCP and 4 µM
antimycin A. Each cycle consisted of 30 s of mixing followed by 2.5 min of
measurements. 


Quantitative PCR and RNA-seq 

Total RNA was isolated from cells and tissues using the Direct-zol 
RNA MiniPrep Plus kit (Zymo Research). Taqman-based quantita- 
tive real-time PCR was performed using the SuperScript III Platinum 
One-Step qRT-PCR reagent (Thermo Fisher Scientific). Samples 
were run in triplicate as multiplexed reactions normalized to an 
internal control (36B4; acidic ribosomal phosphoprotein P0 mRNA).
Sequences of primers and probes used are included in Supplementary 
Information. 

For RNA-seq, total RNA was extracted from BAT of wild-type and 
PATKO mice at 30 °C using the Direct-zol RNA extraction kit (Zymo 
Research). PolyA+ RNA was fragmented and prepared into
strand-specific libraries using the Illumina TruSeq stranded RNA kit (Illumina)
and analysed on an Illumina HiSeq 2500 sequencer. Libraries were
sequenced using single-end 50-bp reads at a depth of 10-15 million
reads per library. Single-end sequencing reads were mapped to the
mouse reference genome (mm9, NCBI37) using STAR (version 2.3.0.c,
default parameters). Only reads that aligned uniquely to a single 
genomic location were used for downstream analysis (MAPQ >10). 
Gene expression values were calculated for read counts on exons of 
annotated RefSeq genes using HOMER. DEGs were calculated with 
four replicates per condition using EdgeR, and athreshold of adjusted 
Pvalue <0.05 was used to call DEGs. DEGs were used for pathway and 
Gene ontology functional enrichment analysis using Ingenuity Pathway 
Analysis (Qiagen) and Metascape” (http://metascape.org). Heat maps 
were generated using RStudio software (package ‘gplots’). Pie charts 
and Circos plots were generated with Metascape and Adobe Illustra- 
tor. Data are available in GEO (GSE124621). Cell type-specific regula- 
tory elements were download from the ENCODE SCREEN portal, using 
biosample ‘C57BL/6 brown adipose tissue male adult 24 weeks’. BAT- 
specific enhancers as annotated by ENCODE (typically high DNase and 
H3K27ac signal but no H3K4me3 signal) were lifted over tomm49 using 
UCSC LiftOver and associated to genes by proximity (20 kb from TSS). 
Homer 4.9.1 was used to find enriched known and de novo motifs in 
enhancers associated to genes of interest. 
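
The proximity rule used to associate enhancers with genes (within 20 kb of the TSS) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name, the tuple layout of the inputs and the use of the enhancer midpoint as the reference position are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: an enhancer is (chrom, start, end);
# each gene maps to (chrom, tss_position).
Enhancer = Tuple[str, int, int]


def associate_enhancers(
    enhancers: List[Enhancer],
    tss_by_gene: Dict[str, Tuple[str, int]],
    max_dist: int = 20_000,
) -> Dict[str, List[Enhancer]]:
    """Assign each enhancer to every gene whose TSS lies within max_dist bp
    of the enhancer midpoint on the same chromosome."""
    hits: Dict[str, List[Enhancer]] = {gene: [] for gene in tss_by_gene}
    for chrom, start, end in enhancers:
        midpoint = (start + end) // 2  # assumption: distance measured from the midpoint
        for gene, (gene_chrom, tss) in tss_by_gene.items():
            if gene_chrom == chrom and abs(midpoint - tss) <= max_dist:
                hits[gene].append((chrom, start, end))
    return hits
```

The nested loop is adequate for a small gene list; a genome-wide analysis would more likely use an interval index or a dedicated tool.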


Mouse studies 

All procedures were approved by the Institutional Animal Care and 
Use Committee of The Scripps Research Institute and conducted in 
accordance with relevant ethical regulations. To generate mice with 
adipose-specific deletion of Pgrmc2, mice carrying floxed Pgrmc2 alleles, 
backcrossed to the C57BL/6J background (NNT mutant), were crossed 
with an Adipoq-Cre strain (JAX stock 010803). Similarly, mice with 
dual deletion of Pgrmc1 and Pgrmc2 in adipose tissue were generated 
by crossing mice with floxed Pgrmc1 and Pgrmc2 alleles to the Adipoq-Cre 
strain. Floxed littermates without the cre transgene were used as 
controls and are referred to as wild type. Mice were born at room 
temperature and moved to 30 °C two weeks after weaning. Experiments 
were performed after a minimum of 4 weeks of acclimatization to 30 °C. 
Mice were kept on a 12-h light-dark cycle and fed standard chow breeder 
diet (5058, Picolab) or 60% HFD (D12492, Research Diets) as specified. 
Male and female mice were used in separate gender-matched experiments. 
No gender-specific differences were observed. For molecular 
characterization, mice were euthanized at or around ZT5, extensively 
perfused with ice-cold PBS, and tissues collected and immediately 
frozen in liquid nitrogen. For circadian time-course analysis, wild-type 
and PATKO mice (n=3 per group, per time point) were euthanized every 
4 h over a period of 24 h and tissues harvested as described above. 


Energy balance studies 

Energy balance parameters were determined in a computer-controlled 
open-circuit system (Oxymax) that is part of an integrated Comprehensive 
Laboratory Animal Monitoring System (Columbus Instruments), 
as previously described. Body temperature was monitored using a 
rectal probe (RET-3 probe, TH-5 Thermalert Monitoring Thermometer, 
Physitemp) in cold exposure experiments, and by radiotelemetry in all 
other experiments. Radiotelemetry was enabled by surgically implanting 
a transmitter (TA10TA-F10; Data Sciences) into the peritoneal cavity, 
as previously described. Male mice (20 weeks old) were allowed to 
recover for 14 d post-surgery and were then acclimated for 3 d to the 
experimental environment before measurements were taken. Data 
were recorded by placing a cage containing a mouse implanted with a 
transmitter on a receiver plate (RPC-1; DataScience). Data collection 
and offline analysis were performed using the DATAQUEST A.R.T. software 
(DataScience). To test the response to β3-adrenergic stimulation, the 
β3-adrenergic receptor agonist CL316,243 (1 mg/kg) or an equivalent 
volume of PBS was administered via intraperitoneal injection to 
20-week-old male mice housed at thermoneutrality at ZT4.5. Oxygen 
consumption rate and activity levels were monitored using the CLAMS system. 


Exposure to cold 

Experiments were performed on male and female mice 12-14 weeks of 
age. Mice were individually caged with minimal bedding and free access 
to food and water. To start the cold challenge, they were transferred 
to 4 °C, with controls remaining at 30 °C, and body temperature was 
monitored every 30 min for a total of 2.5 h. In that amount of time, all 
PATKO mice became severely hypothermic and all cold-exposed mice 
were euthanized. Cold challenge experiments started at or around 
ZT5 (11.00). 
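
The pairing of ZT5 with 11.00 implies that lights-on (ZT0) was at 06.00. Under that inferred assumption, Zeitgeber times quoted in these methods can be converted to clock times with a short helper; the function name and the default lights-on time are illustrative and not taken from the study.

```python
from datetime import datetime, timedelta


def zt_to_clock(zt_hours: float, lights_on: str = "06:00") -> str:
    """Convert Zeitgeber time (hours after lights-on) to 24-h clock time.

    The default lights-on time is an assumption inferred from ZT5 = 11.00.
    """
    start = datetime.strptime(lights_on, "%H:%M")
    return (start + timedelta(hours=zt_hours)).strftime("%H:%M")
```

For example, `zt_to_clock(4.5)` gives the clock time of the ZT4.5 injections under the same assumption.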


Labile haem quantification 

To purify nuclear and mitochondrial fractions from BAT and liver, one 
lobe of BAT or 100 mg of liver was Dounce-homogenized in isolation 
buffer 1 (IB1: 220 mM mannitol, 70 mM sucrose, 5 mM EGTA and 
50 mM MOPS pH 7.4). After centrifugation at 1,000g for 10 min at 4 °C, 
the nuclear pellet was passed through a 100-μm strainer and washed 
5 times in IB1. Mitochondria in the supernatant were isolated from the 
cytosolic fraction after a second centrifugation at 9,500g for 10 min at 
4 °C and washed twice with IB1. The nuclear and mitochondrial pellets 
were resuspended in 50 μl dH2O, sonicated and protein content quantified 
using the DC assay (Bio-Rad). Five microlitres of 25 nM apo-HRP 
was incubated in 384-well format with 10 μg in 5 μl of purified 
mitochondrial or nuclear protein lysates. After 5 min at room temperature, 
40 μl of haem assay buffer (50 μM Amplex UltraRed, 0.02% H2O2 in 
0.1 M NaH2PO4/Na2HPO4 buffer, pH 6) was added and fluorescence 
(ex.-em. 490/585) measured immediately for 15 min. 


Iron quantification 

Frozen tissue (BAT, about 50 mg) was pulverized and lysed in 50 mM 
NaOH. Non-haem iron content was quantified using 200 μg of protein 
lysate and the ferrozine method as previously described. 


Blood chemistry measurements 

Blood samples were collected either from the retro-orbital plexus of 
anaesthetized mice, or by cardiac puncture after euthanasia. Plasma 
was separated using BD Microtainer PST tubes with lithium heparin. 
Triglycerides and non-esterified free fatty acids were measured using 
the Serum Triglyceride Determination Kit (Sigma) and the HR Series 
NEFA-HR(2) kit (Wako). Norepinephrine levels were quantified using 
an ELISA kit (Abnova). Insulin levels were determined using an Ultra- 
Sensitive Rat ELISA Kit (Crystal Chem). 


Tissue lipid content 

Frozen tissue (liver, about 30 mg) was pulverized and lysed in RIPA 
buffer. Triglycerides were quantified in 10 μl of tissue lysate using the 
EnzyChrom Triglyceride Assay kit (EGTA-200, Bioassay Systems). 
Triglyceride content was normalized to tissue weight. 


Treatment with CPAG-1 

C57BL/6J male mice fed a 60 kcal% fat diet (D12492, Research Diets) 
were purchased from The Jackson Laboratory (DIO, JAX stock 380050) 
and kept on the same diet throughout the studies. DIO mice (>12 weeks 
of HFD, 20 weeks of age), randomized based on weight and fasting 
glycaemia, were dosed intraperitoneally with CPAG-1 every other day 
(45 mg/kg in a 2:3:1:4 DMSO:PEG40:ethanol:PBS vehicle solution). 
Weight and fasted glucose levels were monitored weekly. Mice were 
fasted for 16 h before analysis of basal blood chemistry parameters. At 
the conclusion of treatment, tissues were collected and snap-frozen 
for RNA extraction and western blot analysis or fixed for histological 
examination. 
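
As a worked illustration of the dosing arithmetic, the sketch below splits a batch of vehicle into its components and computes the compound mass for one animal. It assumes that the 2:3:1:4 ratio is by volume, which the text does not state; the function names and the example batch volume and body weight are hypothetical.

```python
def vehicle_volumes(total_ul, parts=(("DMSO", 2), ("PEG40", 3), ("ethanol", 1), ("PBS", 4))):
    """Split a total vehicle volume (in microlitres) according to the stated ratio.

    Assumption: the ratio is volume:volume.
    """
    total_parts = sum(share for _, share in parts)
    return {name: total_ul * share / total_parts for name, share in parts}


def dose_mg(body_weight_g, dose_mg_per_kg=45.0):
    """Mass of compound (mg) for one animal at the stated dose in mg/kg."""
    return dose_mg_per_kg * body_weight_g / 1000.0


# Hypothetical example: 1 ml of vehicle and a 40 g mouse.
print(vehicle_volumes(1000))
print(dose_mg(40))
```

The injection volume per animal is not given in the text, so no working concentration is derived here.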


Glucose and insulin tolerance tests 


For glucose tolerance tests, mice were fasted for 6 h, and blood was 
collected from the tail vein before and at timed intervals after oral 
gavage of glucose (1 g/kg). Plasma glucose was measured with a 
OneTouch Ultra glucometer (Johnson & Johnson). For insulin tolerance 
tests, mice fasted for 4 h were injected intraperitoneally with insulin 
(0.4 U/kg; Novolin, Novo Nordisk). Glucose levels were determined 
before and at timed intervals after injection of insulin. 


Histology 

Liver, and brown (BAT) and white (WAT) adipose tissue were fixed in 
Z-Fix (Anatech), dehydrated, embedded in paraffin, and 3-μm- (liver and 
BAT) or 10-μm- (WAT) thick sections stained with haematoxylin and 
eosin. Cell size was analysed using ImageJ software. 


Electron microscopy 

BAT depots were collected and immediately placed in fixative buffer 
(2.5% paraformaldehyde, 3% glutaraldehyde, 0.02% picric acid in 
cacodylate buffer, pH 7.3) and stored at 4 °C for 72 h. Fixative buffer was 
refreshed after 48 h. Tissues were extensively washed in 0.1 M sodium 
cacodylate buffer (pH 7.3) prior to post-fix incubation in 2% OsO4 in 
0.1 M sodium cacodylate buffer for 4 h (buffer was refreshed after 
2 h). Tissues were then washed in 0.1 M sodium cacodylate buffer (pH 
7.3) followed by water. Tissues were dehydrated in a graded ethanol 
series and infiltrated and embedded in Spurr resin (Sigma-Aldrich). 
Thin sections were post-stained with 2% uranyl acetate followed by 
lead citrate and examined in a FEI Philips CM100 electron microscope 
at 80 kV. Images were taken using Radius 1.3 software with a Megaview 
G2 CCD Camera (EMSIS GmbH). 


Statistics 

Results from in vitro assays and cell culture data are presented 
as mean ± s.d. Data generated in mouse studies are presented 
as mean ± s.e.m. The number of mice used in each experiment is 
indicated in the figure legends. Statistical analysis was performed 
in Prism software (GraphPad) using Student’s t-test for comparisons 
between two groups, one-way ANOVA with multiple comparisons 
for assessment of more than two groups, and two-way ANOVA with 
multiple comparisons for repeated measurements. Comparisons 
among specific groups were done using post-tests as indicated in the 
figure legends. 
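
For readers reproducing the summary statistics, the two error measures and the two-group comparison can be sketched with the Python standard library. This illustrates the textbook formulas only, not the Prism workflow used in the study; the function names are hypothetical, and the t statistic assumes equal variances (pooled estimate).

```python
import math
import statistics


def summarize(values):
    """Return (mean, s.d., s.e.m.) using the sample s.d. (n - 1 denominator)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    sem = sd / math.sqrt(n)  # s.e.m. shrinks with sample size; s.d. does not
    return mean, sd, sem


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2
```

Converting the t statistic to a P value requires the t distribution, which is not in the standard library; a statistics package would be needed for that step.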


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


Source data tables are provided for Figs. 1-4 and Extended Data Figs. 1-8. 
Full scans of all western blots are shown in the Supplementary Informa- 
tion. RNA-seq data are available in the Gene Expression Omnibus under 
accession number GSE124621. All other data supporting the findings in 
this study are available from the corresponding author upon request. 
Extended Data Fig. 1 | PGRMC2 binds haem and, with PGRMC1, coordinates 
its intracellular distribution. a, Absorbance spectra of mouse PGRMC2 
protein show peaks of haem-protein complexes in the 390-450-nm range. 
Dotted spectra indicate haem-protein complexes after 10 mM dithionite 
reduction of the iron moiety. b, LC-MS/MS spectra of haemin standard (left) 
and PGRMC2 protein (right) with collision energy of 40 V. c, Isotope envelope 
of haemin calculated on the basis of isotope natural abundance for 
C34H32ClFeN4O4 (left), PGRMC2 protein (centre) and haemin standard (right). 
d, Purified mouse PGRMC2(3xM) mutant (Y131F/K187A/Y188F) does not bind 
haem. e, The Soret peak typical of haemoproteins is absent in PGRMC2(3xM). 
f, Representative fluorescence imaging of cells expressing targeted HRP or 
APX labile haem reporters, showing their localization to mitochondria, 
endoplasmic reticulum, nucleus and cytosol. g, Levels of Pgrmc2 and Pgrmc1 
mRNA in siRNA-transfected HEK293T cells (n=3 biologically independent 
samples). h, Interaction of PGRMC1 with PGRMC2 is not observed when 
PGRMC2 is immunoprecipitated using an antibody that recognizes the 
haem-binding domain at the C terminus of PGRMC2. Representative results from two 
(a-e, h) or three (f, g) independent experiments. Data presented as mean ± s.d., 
***P < 0.001 versus scrambled basal; two-way ANOVA with multiple 
comparisons and a Tukey’s post-test. 
Extended Data Fig. 2 | Pgrmc2 is enriched in adipose tissue and regulates 
BAT function. a, PGRMC2 protein levels increase during adipocyte 
differentiation. 3T3-L1 preadipocytes were induced to differentiate and 
protein extracts prepared at the indicated time points. PPARγ and CEBPδ are 
markers of mature adipocytes and preadipocytes, respectively. Representative 
results from three independent experiments. b, Profile of Pgrmc2 mRNA 
expression across mouse tissues (n=5 biologically independent samples). 
c, Whole-body and inguinal subcutaneous fat weight of chow-fed wild-type 
and PATKO mice housed at 30 °C (WT, n=8; PATKO, n=9). d, OCR, core body 
temperature, CO2 production rate, respiratory exchange ratio (RER), and 
activity oscillations of PATKO mice housed at 30 °C (WT, n=5; PATKO, n=6). 
e, Levels of plasma noradrenaline, glucose and non-esterified fatty acids 
(NEFA) in wild-type and PATKO mice on cold challenge (WT, n=5; PATKO, n=7). 
f, Increased oxygen consumption upon acute injection of the β3-agonist 
CL316,243 (1 mg kg⁻¹) is reduced in PATKO mice housed at 30 °C, despite 
comparable motor activity (n=5 biologically independent samples). 
g, Adipose-specific PGRMC1 and PGRMC2 double-knockout mice (DKO) 
housed at 30 °C are cold-intolerant (WT, n=13; DKO, n=8 biologically 
independent samples). Survival curves of wild-type and DKO mice exposed to 
4 °C (homeothermia is at 31 °C). Mice were exposed to 4 °C at ZT5. Data 
presented as mean ± s.e.m. *P < 0.05 and ***P < 0.001 versus wild type; 
two-tailed Student’s t-test (e, f) or two-way ANOVA with multiple comparisons and a 
Tukey’s post-test (g). 
Extended Data Fig. 3 | Effect of Pgrmc2 deletion in BAT. BAT from chow-fed 
wild-type and mutant mice housed at 30 °C was analysed. a, Levels of 
succinyl-CoA, glycine and aminolevulinic acid (ALA) in BAT quantified using targeted 
metabolomics (n=5 biologically independent samples per group). b, PATKO 
mice show reduced expression of Alas1 and Alas2 in BAT (n=3 biologically 
independent samples per group). c, Nuclear labile haem is significantly lower in 
BAT of fat-specific PGRMC1 and PGRMC2 DKO mice housed at 30 °C (n=4 
biologically independent samples per group). Similar to PATKO mice, BAT of 
DKO mice is discoloured. Representative results from two independent 
experiments. d, Expression of Rev-Erbα and BACH1 targets (Bmal1 and Fth1, 
respectively) in BAT of PATKO mice housed at 30 °C (WT, n=5; PATKO, n=6). 
e, Circadian oscillation of clock components is not altered in PATKO BAT (n=3 
biologically independent samples per group per time point). f, Gene ontology 
(GO) category analysis (biological process) of significantly downregulated 
genes in RNA-seq analysis of BAT from wild-type and PATKO mice housed at 
30 °C (n=4 biologically independent samples per group). P values determined 
by standard accumulative hypergeometric statistical test. g, Circos plot of 
haem-related DEGs showing that the majority (28 out of 45) of them belong to 
the top-3 downregulated biological processes. Number in parentheses below 
each biological process represents the total number of DEGs in PATKO BAT in 
that category. Blue lines refer to downregulated DEGs and red lines to 
upregulated DEGs. Data presented as mean ± s.e.m. *P < 0.05, **P < 0.01 and 
***P < 0.001 versus wild type determined by two-tailed Student’s t-test. 
Extended Data Fig. 4 | Primary brown adipocytes recapitulate the 
mitochondrial defects of PATKO BAT. a, Wild-type and PGRMC2-null primary 
brown adipocytes differentiated in vitro imaged on day eight. Lipid stained 
with Nile red (red) and nuclei stained with Hoechst (blue). Scale bar, 100 μm. 
b, Protein levels of adipocyte markers during the course of differentiation. 
c, PGRMC2-null brown adipocytes have impaired mitochondrial respiration 
(n=3). d-h, Lack of PGRMC2 in brown adipocytes results in a defective 
mitochondrial response to endogenous (d), synthetic pan β-adrenergic 
agonists (e) and pan β3-adrenergic agonists (f), and to downstream activators 
of adrenergic signalling (g, h) (n=5). i, Induction of noradrenaline-responsive 
genes is similar in wild-type and PGRMC2-null brown adipocytes (n=3) 
exposed to 100 nM noradrenaline for 2 h. j, OXPHOS proteins and UCP1 are 
reduced in primary brown PATKO adipocytes. k, PGRMC1 and PGRMC2 DKO 
primary brown adipocytes differentiated in vitro show severe mitochondrial 
dysfunction, an inability to increase oxygen consumption on noradrenaline 
exposure (n=3), and reduced UCP1 and OXPHOS proteins. l, m, Overexpression 
of human wild-type PGRMC2, but not of a haem-binding mutant (3xM 
(Y137F/K193A/Y194F)), can rescue mitochondrial function and the response to 
noradrenaline in PATKO adipocytes (l, n=4; m, WT-mCherry, WT-WT, PATKO-WT, 
n=8; WT-3xM, PATKO-3xM, n=7; PATKO-mCherry, n=6). n, Ucp1 mRNA 
expression is restored when human wild-type PGRMC2, but not the 
haem-binding mutant 3xM, is expressed in PATKO cells (n=3). o, Levels of mouse and 
human Pgrmc2 mRNA in primary adipocytes used in l-n (n=3). In a-o, n 
represents biologically independent samples. Representative results from two 
(j-o) or three (a-i) independent experiments. Data presented as mean ± s.d. 
*P < 0.05, **P < 0.01 and ***P < 0.001 versus wild type; ***P < 0.001 versus vehicle; 
two-way ANOVA with multiple comparisons and a Bonferroni’s post-test. 
Extended Data Fig. 5 | PGRMC2-mediated transport of endogenous labile 
haem regulates mitochondrial function in primary brown adipocytes. 
a, b, Inhibition for 48 h of endogenous haem synthesis with 0.5 mM 
succinylacetone (FBS + SA), but not exogenous haem depletion 
(haem-depleted FBS), in wild-type primary brown adipocytes phenocopies the 
mitochondrial defects of PATKO cells (a, n=8; b, n=4). c, d, Treatment with 
succinylacetone (0.5 mM) markedly reduces Ucp1 mRNA and protein levels 
(n=3). e, Exogenous haemin (20 μM) does not correct mitochondrial 
dysfunction in PATKO cells (n=3). f, PATKO brown adipocytes show higher 
levels of Rev-Erbα and BACH1 protein. g, Dual knockdown of Rev-Erbα and 
BACH1 in mature PATKO adipocytes restores mitochondrial respiration (n=5). 
h, Pgrmc2, Rev-Erbα (also known as Nr1d1) and Bach1 mRNA in control and 
knockdown cells. In a-h, n represents biologically independent samples. 
Representative results from two independent experiments. Data presented as 
mean ± s.d. **P < 0.01 and ***P < 0.001 versus wild type; ***P < 0.001 versus 
scrambled; two-way ANOVA with multiple comparisons and a Bonferroni’s 
post-test. 
Extended Data Fig. 6 | Body composition of PATKO mice fed a HFD. Wild-type 
and PATKO mice were fed HFD for 20 weeks. a, Body weight progression (WT, 
n=7; PATKO, n=9). b, BAT of PATKO mice fed HFD is smaller compared to BAT 
of HFD-fed wild-type mice. No difference was seen in inguinal WAT (iWAT), 
epididymal WAT (eWAT) or liver weight (WT, n=7; PATKO, n=9). c, PATKO mice 
fed HFD had higher levels of plasma triglycerides and NEFA (WT, n=7; PATKO, 
n=8). d, H&E staining of liver shows increased steatosis in PATKO mice. Scale 
bar, 100 μm. Representative images of seven biologically independent 
samples. e, PATKO mice fed HFD had more lipid accumulation in liver (n=8). 
In a-e, n represents biologically independent samples. Data presented as 
mean ± s.e.m.; *P < 0.05 versus wild type; two-tailed Student’s t-test. 
Extended Data Fig. 7 | Analysis of adipose depots of PATKO mice fed a HFD. 
Wild-type and PATKO mice were fed a HFD for 20 weeks. a, H&E stain images of 
BAT from wild-type and PATKO mice on a HFD show similar morphology. Insets 
are magnified on the right. Scale bar, 100 μm. Representative images of seven 
biologically independent samples. b, Gene expression analysis in BAT shows 
reduced levels of Fth1 and Bmal1, targets of BACH1 and Rev-Erbα respectively, 
in PATKO BAT (WT, n=7; PATKO, n=8). c, H&E staining of iWAT and eWAT from 
wild-type and PATKO mice fed HFD does not show clear differences. Scale bar, 
100 μm. Representative images of seven biologically independent samples. 
d, Size analysis of iWAT and eWAT adipocytes from HFD-fed wild-type and 
PATKO mice. The x axis indicates area in μm² (n=5 images of biologically 
independent samples). e, Gene expression analysis in iWAT reveals a modest 
increase in expression of genes involved in lipid handling. Similar to BAT, Bmal1 
expression is significantly reduced in iWAT of PATKO mice (WT, n=7; PATKO, 
n=9). In a-e, n represents biologically independent samples. Data presented as 
mean ± s.e.m.; *P < 0.05, **P < 0.01 and ***P < 0.001 versus wild type; two-way 
ANOVA with multiple comparisons and a Bonferroni’s post-test. 
Extended Data Fig. 8 | Effect of pharmacological activation of PGRMC2 in 
DIO mice. DIO mice were treated with CPAG-1 for 30 days. a, Body weight (left) 
and food intake (right) progression (n=8). b, Expression of Pgc-1α and Bmal1 is 
increased in BAT of treated DIO mice (n=8). c, H&E staining of iWAT shows no 
difference between vehicle- and CPAG-1-treated DIO mice. Scale bar, 100 μm. 
d, Gene expression analysis reveals increased expression of Pgc-1α and Ucp1 in 
iWAT of CPAG-1-treated DIO mice (n=8). e, H&E staining shows reduced fibrosis 
and immune cell infiltration in eWAT of DIO mice treated with CPAG-1. Scale bar, 
100 μm. f, Gene expression analysis shows decreased expression of markers of 
inflammation in eWAT of treated mice (n=8). g, H&E staining of liver shows that 
CPAG-1 treatment modestly reduces lipid deposition. Scale bar, 100 μm. 
h, Hepatic gene expression analysis shows decreased levels of gluconeogenic 
genes and inflammation markers in liver of treated mice (n=8). i, Treatment 
with CPAG-1 for four days significantly increases nuclear labile haem levels in 
the liver of DIO mice (n=4). In a-i, n represents biologically independent 
samples. Representative images of eight biologically independent samples per 
group (d, e, g). Data presented as mean ± s.e.m.; *P < 0.05, **P < 0.01 and 
***P < 0.001 versus vehicle; two-way ANOVA with multiple comparisons and a 
Bonferroni’s post-test. 
Extended Data Fig. 9 | Evaluation of interaction of CPAG-1 with PGRMC1 and 
PGRMC2 in live cells. a, HEK293T cells transfected with expression vectors 
for either PGRMC1 or PGRMC2 were treated with 10 μM probe 25 (the 
photoreactive form of CPAG-1) and DMSO, 100 μM haemin or 100 μM CPAG-1 
for 30 min followed by UV photocross-linking, lysis and conjugation of labelled 
proteomes to a tetramethylrhodamine (TAMRA)-azide tag. Labelled 
proteomes were separated by SDS-PAGE and visualized by in-gel fluorescence 
scanning. The intensity of the signals indicates the affinity of probe 25 for the 
overexpressed proteins. The black asterisk marks PGRMC1 protein and the red 
asterisk marks PGRMC2 protein. Although detectable, PGRMC1 shows very 
poor labelling with probe 25 relative to PGRMC2. Both interactions can be 
competed by haemin or CPAG-1. Western blot analysis confirms expression of 
PGRMC1 and PGRMC2 in transfected cells. Representative results from two 
independent experiments. 
Extended Data Fig. 10 | PGRMC2 is an intracellular haem chaperone critical 
for adipocyte function. Model of the proposed role for PGRMC2 in haem 
dynamics in brown adipocytes. PGRMC2 acquires haem from PGRMC1, which 
forms a complex with FECH, the last enzyme in haem synthesis. PGRMC2, 
located in the endoplasmic reticulum and the nuclear envelope, facilitates 
delivery of labile haem to the nucleus. Nuclear labile haem alters expression of 
genes regulated by haem-responsive transcriptional repressors such as 
Rev-Erbα and BACH1, which influence mitochondrial bioenergetics. FLVCR1b, a 
mitochondrial haem exporter identified in erythrocytes, and HRG-1, a plasma 
membrane haem importer characterized in macrophages, are also shown. 
FLVCR1b and HRG-1 are both expressed in brown adipocytes, but their role in 
haem dynamics in this cell type remains to be defined. 
Data analysis RNA-seq reads were mapped to the mouse genome mm9 NCBI37 using STAR 2.3.0.c (default parameters). Gene expression values were 
calculated using HOMER 4.9.1. DEGs were calculated with four replicates per group using EdgeR v3.5. DEGs were analyzed with Ingenuity 
Pathway Analysis (version 01-07, QIAGEN), Metascape (http://metascape.org), and HOMER (version 4.9.1, http://homer.ucsd.edu/homer/motif/). 
Additional software used in data analysis: GraphPad Prism v6.0h, RStudio v1.1.383, Image Lab v5.2.1, Microsoft Excel v16.28. 
been deposited in GEO under accession number GSE124621. All other data present in this study are available from the corresponding author upon request. 
Antibodies used The antibodies and dilutions used in this work were: 
PGRMC2 (1:1,000, Bethyl Laboratories, A302-954A and A302-955A) 
PGRMC1 (1:1,000, Bethyl Laboratories, A304-561A) 
PPARg (1:200, Santa Cruz Biotechnology, sc-7273) 
REV-ERBa (1:200, Santa Cruz Biotechnology, sc-100910) 
BACH1 (1:500, R&D Systems, AF5777) 
UCP1 (1:5,000, Thermo Fisher Scientific, PA124894) 
OxPhoS (1:300, Thermo Fisher Scientific, 458099) 
GAPDH (1:5,000, GeneTex, GTX627408) 
TUBULIN (1:5,000, GeneTex, GTX27291) 
HSP90 (1:5,000, GeneTex, GTX101423) 
CEBPd (1:1,000, Abgent, AP20492c) 
Rabbit IgG (Abcam, 37415) 
Anti-rabbit IgG HRP-conjugated (1:10,000, Jackson immunoresearch, 211-035-109) 
Anti-mouse IgG HRP-conjugated (1:20,000, Jackson immunoresearch, 315-035-045) 
Anti-goat IgG HRP-conjugated (1:10,000, Jackson immunoresearch, 705-035-003) 
Validation Antibodies were validated for the application and species used in this study by their manufacturers, and by the authors by using 
tissues derived from mouse null mutants. Validation data for the antibodies used can be found as follows: 

PGRMC2 https://www.bethyl.com/product/A302-954A/PGRMC2+Antibody 

https://www.bethyl.com/product/A302-955A/PGRMC2+Antibody 

PGRMC1 https://www.bethyl.com/product/A304-561A/PGRMC1+Antibody 

PPARg https://www.scbt.com/scbt/product/ppargamma-antibody-e-8 

REV-ERBa https://www.scbt.com/scbt/product/rev-erbalpha-antibody-rs-14?requestFrom=search 

BACH1 https://www.rndsystems.com/products/mouse-bach1-antibody_af5777 

UCP1 https://www.thermofisher.com/antibody/product/UCP1-Antibody-Polyclonal/PA1-24894 

OxPhoS https://www.thermofisher.com/antibody/product/OxPhos-Rodent-WB-Antibody-clone-Cocktail-Cocktail/45-8099 

GAPDH https://www.genetex.com/Product/Detail/GAPDH-antibody-GT239/GTX627408 

TUBULIN https://www.genetex.com/Product?category=0&keyword=GTX27291&page=1 

HSP90 https://www.genetex.com/Product?category=0&keyword=GTX101423&page=1 

CEBPd https://www.abgent.com/products/AW5199-R-Cebpd-Antibody-Center 

Rabbit IgG https://www.abcam.com/rabbit-igg-polyclonal-isotype-control-ab37415.html 

Anti-rabbit IgG HRP-conjugated https://www.jacksonimmuno.com/catalog/products/211-035-109 

Anti-mouse IgG HRP-conjugated https://www.jacksonimmuno.com/catalog/products/315-035-045 

Anti-goat IgG HRP-conjugated https://www.jacksonimmuno.com/catalog/products/705-035-003 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) HEK293T cells were obtained from ATCC (CRL-3216). 
Authentication Short-tandem repeat (STR) profiling. 
Mycoplasma contamination Cells were routinely tested for mycoplasma (at least once every two months) and were always negative. 


Commonly misidentified lines No commonly misidentified cell lines were used. 
(See ICLAC register) 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals Mice with floxed Pgrmc2 alleles and backcrossed to the C57BL/6J background (NNT mutant) were crossed with an Adipoq-CRE 
strain (JAX stock 010803) to generate mice with adipose-specific deletion of Pgrmc2. Floxed littermates without the CRE 
transgene were used as controls and are referred to as WT. Similarly, mice with dual deletion of Pgrmc1 and Pgrmc2 in adipose 
tissue were generated by crossing mice with floxed Pgrmc1 and Pgrmc2 alleles to the Adipoq-CRE strain. Mice were born and 
weaned at room temperature and moved to 30°C two weeks after weaning. Studies were performed in male and female mice. 
Primary brown preadipocytes were isolated from male and female pups (0-2 days old). 

C57BL/6 DIO mice were purchased from Taconic at 18 weeks of age. 


Wild animals This study did not involve wild animals. 


Field-collected samples This study did not involve samples collected from the field. 
Bile acids are abundant in the mammalian gut, where they undergo bacteria-mediated 
transformation to generate a large pool of bioactive molecules. Although bile acids 
are known to affect host metabolism, cancer progression and innate immunity, it is 
unknown whether they affect adaptive immune cells such as T helper cells that 
express IL-17a (TH17 cells) or regulatory T cells (Treg cells). Here we screen a library of 
bile acid metabolites and identify two distinct derivatives of lithocholic acid (LCA), 
3-oxoLCA and isoalloLCA, as T cell regulators in mice. 3-OxoLCA inhibited the 
differentiation of TH17 cells by directly binding to the key transcription factor 
retinoid-related orphan receptor-γt (RORγt) and isoalloLCA increased the differentiation of Treg 
cells through the production of mitochondrial reactive oxygen species (mitoROS), 
which led to increased expression of FOXP3. The isoalloLCA-mediated enhancement 
of Treg cell differentiation required an intronic Foxp3 enhancer, the conserved 
noncoding sequence (CNS) 3; this represents a mode of action distinct from that of 
previously identified metabolites that increase Treg cell differentiation, which require 
CNS1. The administration of 3-oxoLCA and isoalloLCA to mice reduced TH17 cell 
differentiation and increased Treg cell differentiation, respectively, in the intestinal 
lamina propria. Our data suggest mechanisms through which bile acid metabolites 
control host immune responses, by directly modulating the balance of TH17 and Treg 
cells. 


Bile acids are cholesterol-derived natural surfactants that are produced 
in the liver and secreted into the duodenum. They are critical for lipid 
digestion, antibacterial defence and glucose metabolism. Although 
95% of bile acids are re-absorbed through the terminal ileum of the 
small intestine and recirculated to the liver, bacteria transform 
hundreds of milligrams of bile acids to secondary bile acids with unique 
chemical structures. In the healthy human gut, concentrations of 
secondary bile acids are in the hundreds of micromolar range. Some 
bile acids disrupt cellular membranes owing to their hydrophobic 
nature, whereas other bile acids protect the gut epithelium and confer 
resistance to pathogens such as Clostridium difficile. Bile acids also 
influence gut-associated inflammation, which suggests their potential 
to regulate gut mucosal immune cells. The immune-modulatory 
effects of bile acids have mostly been studied in the context of innate 
immunity. A recent study reported the cytotoxic effects of bile 
acids on gut-residing T cells; however, whether these acids modulate 
T cell function directly has not been thoroughly examined. Since the 
identification of digoxin (a plant-derived molecule that contains a 
sterol-like core) as the first TH17 cell inhibitor that binds to RORγt and 
inhibits its activity, other structurally similar cholesterol derivatives 
have been identified as RORγt modulators. Because bile acids belong 
to a family of cholesterol metabolites and exist in the gut (where many 
TH17 cells reside), we reasoned that bile acids control TH17 cell function 
by modulating RORγt activity. 


Screen for T cell modulatory bile acids 


To identify bile acids with modulatory effects on T cells, we screened
about 30 compounds. Our screen included both primary bile acids that
are synthesized by the host and secondary bile acids that are produced
by bacterial modification of primary bile acids (Extended Data Fig. 1).
Naive CD4+ T cells were isolated from wild-type C57BL/6J (hereafter,
B6 Jax) mice and cultured with bile acids under TH17-cell differentiation
conditions and, as a counter-screen, Treg-cell differentiation conditions
(Extended Data Fig. 2). Notably, two derivatives of LCA were found to
substantially affect the differentiation of TH17 and Treg cells. 3-OxoLCA
inhibited TH17 cell differentiation, as shown by reduced expression of
IL-17a, and isoalloLCA enhanced Treg cells, as shown by increased FOXP3
expression (Fig. 1a, b, Extended Data Fig. 2d, e). Although isoalloLCA
strongly enhanced FOXP3 expression in the presence of low, but not
high, concentrations of TGFβ (Fig. 1a, Extended Data Fig. 3a-c), the
Treg-cell-enhancing activity of isoalloLCA required TGFβ, as shown by
the fact that pretreatment of cells with anti-TGFβ antibody prevented
FOXP3 enhancement (Extended Data Fig. 3d, e).

The modulatory effects of 3-oxoLCA on TH17 cells, and isoalloLCA
on Treg cells, were specific to cell type; neither compound affected
T cell differentiation into type 1 or type 2 T helper (TH1 or TH2) cells,
as assessed by the expression of the cytokines IFNγ and IL-4 and the
transcription factors T-bet and GATA3 (Fig. 1a, b, Extended Data Fig. 3f,
g). Although 3-oxoLCA did not affect Treg cells (Fig. 1b, Extended Data
Fig. 2e), isoalloLCA reduced the differentiation of TH17 cells by about
50% without affecting RORγt expression (Fig. 1a, b, Extended Data
Fig. 3h). Both compounds exhibited dose-dependent effects (Extended
Data Fig. 4a). 3-OxoLCA did not affect cell proliferation, whereas the
addition of isoalloLCA to T cells led to reduced proliferation compared
to a dimethylsulfoxide (DMSO) control (Extended Data Fig. 4b).
Treatment with isoalloLCA did not impair cell viability (Extended Data
Fig. 4c) or activation mediated by T cell receptor (TCR), as indicated by
a similar expression of markers of TCR activation (such as CD25, CD69,
NUR77 and CD44) between isoalloLCA and control treatments (Extended
Data Fig. 4d). TCR activation promotes the enhancement of Treg cells by
isoalloLCA; increasing TCR activation with higher concentrations of
anti-CD3 resulted in stronger effects on FOXP3 expression, without
affecting cell viability (Extended Data Fig. 4e, f).

Fig. 1 | 3-OxoLCA inhibits TH17 cell differentiation and isoalloLCA enhances
Treg cell differentiation. a, b, Flow cytometry and its quantification of
intracellular staining for IFNγ and IL-4, or IL-17a and FOXP3, in sorted naive
T cells from wild-type B6 Jax mice activated and expanded in the presence of
mouse TH1, TH2, TH17 and Treg cell polarizing cytokines (n = 4, 3, 4 and 3,
respectively, biologically independent samples). A low concentration of TGFβ
(0.01 ng ml⁻¹) was used for Treg cell culture. DMSO, 3-oxoLCA (20 μM) or
isoalloLCA (20 μM) was added on day 0 and CD4+ T cells were gated for analyses
on day 3 for TH17 and Treg cells, and day 5 for TH1 and TH2 cells. Data are
mean ± s.d., by unpaired t-test with two-tailed P value.


3-OxoLCA inhibits TH17 differentiation


We next examined whether 3-oxoLCA physically interacts with the
RORγt protein in vitro. We performed a microscale thermophoresis
assay using recombinant human RORγt ligand-binding domain.
3-OxoLCA exhibited a robust physical interaction with the RORγt
ligand-binding domain at an equilibrium dissociation constant (Kd)
of about 1 μM. We also tested two additional, structurally similar 3-oxo
derivatives of bile acids, 3-oxocholic acid (3-oxoCA) and
3-oxodeoxycholic acid (3-oxoDCA) (Fig. 2a), and found that these
derivatives had Kd values about 20 times higher than that of 3-oxoLCA
(Fig. 2b). Neither 3-oxoCA nor 3-oxoDCA inhibited TH17 cell
differentiation as robustly as did 3-oxoLCA (Fig. 2c, d). Next, we
examined whether 3-oxoLCA modulates the transcriptional activity of
RORγt. We assayed the effect of the bile acids on firefly luciferase
expression directed by a fusion protein of RORγt and GAL4 DNA-binding
domain in human embryonic kidney (HEK) 293 cells. Cells treated with
ML209, a specific RORγt antagonist, completely lost RORγt activity.
Similarly, treatment with 3-oxoLCA significantly reduced the activity of
the RORγt reporter (Fig. 2e). These data suggest that 3-oxoLCA probably
inhibits TH17 cell differentiation by physically interacting with RORγt,
and inhibiting its transcriptional activity.
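
Dual-luciferase readouts of this kind are usually normalized per well as the
ratio of firefly (FLuc) to Renilla (RLuc) signal, which is how the reporter
data are presented in Fig. 2e. The following is only a minimal sketch of that
normalization; the function name and the luminescence counts are hypothetical
and are not taken from the study.

```python
from statistics import mean, stdev


def normalized_reporter_activity(fluc, rluc):
    """Return the per-well FLuc/RLuc ratio for paired luminescence readings."""
    if len(fluc) != len(rluc):
        raise ValueError("need one Renilla reading per firefly reading")
    if any(r <= 0 for r in rluc):
        raise ValueError("Renilla readings must be positive")
    return [f / r for f, r in zip(fluc, rluc)]


# Hypothetical counts for three wells per condition (illustrative only).
dmso = normalized_reporter_activity([9000.0, 9600.0, 9300.0],
                                    [300.0, 320.0, 310.0])
treated = normalized_reporter_activity([3000.0, 3300.0, 3100.0],
                                       [300.0, 330.0, 310.0])

# Summarize each condition as mean and sample standard deviation.
for label, ratios in (("DMSO", dmso), ("treated", treated)):
    print(f"{label}: mean={mean(ratios):.2f}, s.d.={stdev(ratios):.2f}")
```

Dividing by the co-transfected Renilla signal corrects each well for
differences in transfection efficiency and cell number before conditions are
compared.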


IsoalloLCA promotes Treg differentiation


We next sought to uncover the mechanism by which isoalloLCA exerts
its enhancing effects on Treg cells. LCA has a 3α-hydroxyl group as well
as a cis 5β-hydrogen configuration at the A-B ring junction and can
undergo isomerization, presumably via the actions of gut bacterial
enzymes, to form isoLCA (3β,5β), alloLCA (3α,5α) or isoalloLCA (3β,5α)
(Fig. 3a). Among these LCA isomers, isoalloLCA has the lowest log D
value (2.2), comparable to the previously reported log D values of
chenodeoxycholic acid (2.2) and ursodeoxycholic acid (2.2) (Extended
Data Table 1), which suggests that isoalloLCA is less lipophilic than the
other isomers. IsoalloLCA, but not the other LCA isomers, enhanced
FOXP3 expression, confirming that both the 3β-hydroxyl group and
trans (5α-hydrogen) A-B ring configuration of isoalloLCA are required
for enhancement of Treg cells (Fig. 3b). Compared to cells treated with
DMSO, cells treated with isoalloLCA inhibited the proliferation of
T effector cells in vitro, indicating they had acquired regulatory activity
(Extended Data Fig. 5a, b). T cells isolated from FOXP3-GFP reporter
mice exhibited both increased expression of Foxp3 mRNA (Fig. 3c) and
enhanced GFP levels after treatment with isoalloLCA (Extended Data
Fig. 5c). Thus, the enhanced expression of FOXP3 induced by isoalloLCA
occurs at the level of Foxp3 mRNA transcription.

Foxp3 transcription is regulated by three conserved non-coding
enhancers known as CNS1, CNS2 and CNS3 (Fig. 3d), each of which
has a distinct role in the development, stability and function of Treg
cells. Small molecules that promote Treg cells, such as the bacterial
metabolite butyrate and the vitamin A derivative retinoic acid,
enhance FOXP3 expression in a CNS1-dependent manner. TGFβ
also partially requires CNS1 for its Treg-cell-promoting activity, owing
to the binding of its downstream signalling molecule SMAD3 to the
CNS1 enhancer. Whereas CD4+ T cells from mice with deletions in
CNS1 or CNS2 upregulated FOXP3 in response to isoalloLCA, cells
that lack CNS3 did not respond (Fig. 3e, f). By contrast, retinoic acid
and TGFβ boosted Treg cell differentiation in CNS3-deficient cells, albeit
with reduced efficiency (Extended Data Fig. 5d). Thus, unlike other
small molecules that promote Treg cell differentiation (which do so in a
CNS1-dependent manner), the FOXP3-enhancing activity of isoalloLCA
requires CNS3.

Fig. 2 | 3-OxoLCA binds to RORγt and inhibits its transcriptional activity.
a, Chemical structures of 3-oxoLCA, 3-oxoCA and 3-oxoDCA. b, Microscale
thermophoresis assay. 3-OxoLCA binds to the RORγt ligand-binding domain at
a much lower Kd value than do the other two structurally similar bile acids.
c, d, Flow cytometric analyses and quantification of IL-17a production from
mouse naive CD4+ T cells cultured for 3 days under the TH17 cell polarization
condition (n = 3 biologically independent samples per group). DMSO or bile
acids at 20 μM were added 18 h after cytokine addition. e, RORγt luciferase
reporter assay in HEK 293 cells treated with a positive control ML209 (2 μM),
3-oxoLCA (10 μM), 3-oxoCA (10 μM), 3-oxoDCA (10 μM) or DMSO. The ratio of
firefly luciferase (FLuc) to Renilla luciferase (RLuc) activity is presented on the
y axis (n = 3 biologically independent samples per group). Data are mean ± s.d.,
by unpaired t-test with two-tailed P value.

We investigated other known regulators of T cell function. The
transcription factor REL binds to the CNS3 enhancer to induce FOXP3
expression. We found that wild-type and REL-deficient cells express
similar levels of FOXP3 upon treatment with isoalloLCA (Extended
Data Fig. 5e, f). LCA targets the vitamin D receptor (VDR) and the
farnesoid X receptor (FXR). VDR has also previously been implicated
in the modulation of both TH17 and Treg cell function. Compared
to a control treated with DMSO, isoalloLCA-treated cells deficient in
VDR or FXR had similar amounts of FOXP3 induction (Extended Data
Fig. 5g). Thus, the CNS3-dependent activation of FOXP3 by isoalloLCA
is unlikely to be mediated through the actions of REL, VDR or FXR.
VDR and FXR also did not contribute to the suppressive activities of
3-oxoLCA on TH17 cells (Extended Data Fig. 5h). Of note, conjugating
glycine to 3-oxoLCA or isoalloLCA reduced the immunomodulatory
effects of these acids (Extended Data Fig. 5i-k).

CNS3 has previously been implicated in Treg cell development by
promoting epigenetic modifications, such as H3K27 acetylation (H3K27ac)
and H3K4 methylation, at the Foxp3 promoter region. Compared to
cells treated with DMSO, cells treated with isoalloLCA had increased
levels of H3K27ac in the Foxp3 promoter region (Extended Data Fig. 6a).
Consistent with this, treatment with isoalloLCA increased recruitment
of the histone acetyltransferase p300 (Extended Data Fig. 6b). However,
isoalloLCA did not affect H3K4 methylation (Extended Data Fig. 6c).
The pan-bromodomain inhibitor iBET, which antagonizes H3K27ac,
prevented the isoalloLCA-dependent enhancement of FOXP3 in a
dose-dependent manner (Extended Data Fig. 6d, e). Consistent with previous
work, CNS3 deficiency not only reduced basal levels of H3K27ac but
also abrogated the isoalloLCA-dependent increase in H3K27ac levels at
the Foxp3 promoter region (Extended Data Fig. 6f). Therefore, CNS3
is probably needed to establish a permissible chromatin landscape,
whereupon the promoter region is further acetylated after treatment
with isoalloLCA.


MitoROS enhances FOXP3 expression 


Cellular metabolism and epigenetic modification are intricately related;
for example, the by-products of mitochondrial metabolism serve as
substrates for histone acetylation and methylation. Treg cells rely
mainly on oxidative phosphorylation for their energy production.
Recent studies have identified two metabolites, 2-hydroxyglutarate
and D-mannose, that promote Treg cell generation by modulating
mitochondrial activities. To assess whether isoalloLCA affects oxidative
phosphorylation, we measured the oxygen consumption rate in T cells
cultured for 48 h after treatment with DMSO or isoalloLCA. At this time
point, FOXP3 is not yet strongly induced, which makes it possible to
assess the effects of isoalloLCA on cellular metabolism before cells are
fully committed to becoming Treg cells. Compared to DMSO, treatment
with isoalloLCA increased the oxygen consumption rate in both
wild-type and CNS3-knockout cells (Extended Data Fig. 6g), suggesting that
treatment with isoalloLCA increases mitochondrial activity. Reactive
oxygen species (ROS) are produced as by-products of mitochondrial
oxidative phosphorylation. Whereas D-mannose increases cytoplasmic
ROS production, treatment with isoalloLCA led to increased production
of mitoROS without affecting cytoplasmic ROS (Fig. 3g, Extended
Data Fig. 6h, i). Unlike isoalloLCA, other isomers of LCA did not increase
mitoROS production (Fig. 3g). Furthermore, isoalloLCA-treated cells
displayed a modest, but significant, increase in total mitochondrial
mass and mitochondrial membrane potential (Extended Data Fig. 6j,
k). To test whether mitoROS is directly involved in enhanced Treg cell
differentiation by isoalloLCA, we used mitoQ (a mitochondrially targeted
antioxidant) to reduce ROS levels in mitochondria (Extended Data
Fig. 6l). Importantly, in the presence of mitoQ, isoalloLCA was no longer
effective in enhancing Treg cell differentiation (Fig. 3h, i). By contrast, the
retinoic-acid-dependent induction of Treg cells was unaffected by
treatment with mitoQ (Fig. 3h, i). We next investigated whether mitoROS
production is responsible for the enhanced levels of H3K27ac at the
Foxp3 promoter seen in cells treated with isoalloLCA. Co-treating cells
with isoalloLCA and mitoQ decreased levels of H3K27ac, as compared
to levels in cells treated with isoalloLCA only (Extended Data Fig. 6m).
Stronger TCR stimulation that enhances FOXP3 expression (Extended
Data Fig. 4d, e) also increased mitoROS production (Extended Data
Fig. 6n). Although TGFβ, which is essential for the FOXP3-enhancing
activity of isoalloLCA (Extended Data Fig. 3d, e), was not required for
increased mitoROS production (Extended Data Fig. 6o), it was required
for the isoalloLCA-induced H3K27ac at the Foxp3 promoter (Extended
Data Fig. 6m). Because FOXP3 itself enhances mitochondrial oxidative
phosphorylation, and FOXP3-expressing Treg cells had higher levels
of mitoROS than those of other subsets of CD4+ T cells (Extended Data
Fig. 6p), we investigated whether increased mitoROS production was a
secondary effect of enhanced FOXP3 expression. CNS3-deficient cells
that did not express high levels of FOXP3 in response to treatment with
isoalloLCA nevertheless exhibited enhanced oxidative phosphorylation
and increased levels of mitoROS (Fig. 3e, f, Extended Data Fig. 6g,
q). We then investigated whether mitoROS is sufficient to promote
Treg cell differentiation using the mitochondria-targeted redox cycler,
mitoParaquat (mitoPQ). The addition of mitoPQ to a T cell culture was
sufficient to enhance mitoROS production and Treg cell differentiation
in a dose-dependent manner (Extended Data Fig. 6r, s). The Treg cell
differentiation induced by mitoPQ, similar to that induced by isoalloLCA,
required the CNS3 enhancer and TGFβ (Fig. 3j, k, Extended Data Fig. 6t).
Together, our data support a model in which isoalloLCA promotes Treg
cell differentiation by enhancing mitoROS production and increasing
H3K27ac at the Foxp3 promoter region, which also requires
TGFβ-induced signalling (Extended Data Fig. 6u).

Fig. 3 | MitoROS is necessary and sufficient for the isoalloLCA-dependent
enhanced expression of FOXP3. a, Chemical structures of LCA and its isomers:
isoLCA, alloLCA and isoalloLCA. b, FOXP3 expression from mouse naive CD4+
T cells cultured for 3 days with anti-CD3/28 and IL-2. DMSO or bile acids at
20 μM were added to cell culture (n = 3 biologically independent samples per
group). c, Quantitative PCR (qPCR) analysis for Foxp3 transcripts in cells
treated with DMSO or isoalloLCA (20 μM) (n = 3 biologically independent
samples per group). d, Diagram of the Foxp3 gene locus containing the
promoter region (Pro) and intronic enhancer regions (CNS1, CNS2 and CNS3).
e, f, Flow cytometric analyses and quantification of CD4+ T cells stained
intracellularly for FOXP3. Naive CD4+ T cells isolated from wild-type control,
CNS1-, CNS2- or CNS3-knockout mice were cultured with anti-CD3/28 and IL-2,
in the presence of DMSO or isoalloLCA (20 μM) (n = 3 biologically independent
samples per group). g, MitoROS production measured by mitoSOX staining
with T cells cultured in the presence of DMSO or LCA isomers for 48 h. Staining
intensity is reported as mean fluorescence intensity from flow cytometry
analysis (PE channel). Different conditions were then normalized as the fold
change relative to the values of DMSO condition (n = 3 biologically independent
samples per group). h, i, Representative fluorescence-activated cell sorting
(FACS) plots and quantification of T cells stained intracellularly for FOXP3,
cultured with anti-CD3/28, IL-2 and TGFβ (0.05 ng ml⁻¹) in the presence of
DMSO, LCA, isoalloLCA (20 μM) or retinoic acid (RA) (nM), with DMSO or
mitoQ (0.5 μM) for 72 h (n = 3 biologically independent samples per group).
j, k, Flow cytometric analyses and quantification of CD4+ T cells stained
intracellularly for FOXP3. Naive CD4+ T cells isolated from control or
CNS3-knockout mice were cultured with anti-CD3/28 and IL-2 in the presence of
DMSO or mitoPQ (10 μM) (n = 3 biologically independent samples per group).
Data are mean ± s.d., by unpaired t-test with two-tailed P value.


Bile acids set T cell activities in vivo

We next examined whether 3-oxoLCA and isoalloLCA influence TH17
and Treg cell differentiation in vivo, using a mouse model. Segmented
filamentous bacteria (SFB), a murine commensal, are known to induce
TH17 cell differentiation in the small intestine of B6 mice. C57BL/6NTac
mice, from Taconic Biosciences (hereafter, B6 Tac), have abundant TH17
cells in their small intestine, owing to the presence of SFB. By contrast,
B6 Jax mice, which lack SFB, have few intestinal TH17 cells. To
determine whether 3-oxoLCA suppresses TH17 cell differentiation in vivo, we
gavaged B6 Jax mice with a faecal slurry containing SFB and fed these
mice either a control diet or chow containing 0.3% (w/w) 3-oxoLCA for
1 week (Fig. 4a). The resulting average concentration of this metabolite
in caecal contents was 24 pmol mg⁻¹ of wet mass (pmol mg⁻¹ is
approximately equivalent to micromolar) (Extended Data Fig. 7a, b). This
concentration was sufficient to suppress TH17 differentiation in vitro
(Fig. 1c). Indeed, treatment with 3-oxoLCA significantly reduced the
percentage of ileal TH17 cells (Fig. 4b). When we quantified the average
levels of 3-oxoLCA in the stool of human patients with ulcerative colitis
or in the caeca of conventionally housed mice, we observed a mean
concentration of 23 or 1.0 pmol mg⁻¹, respectively (Extended Data Fig. 7c, d).
Levels of SFB colonization were comparable between control and
3-oxoLCA-treated groups of mice, which suggests that the change in the
percentage of TH17 cells was not due to a decrease in SFB colonization
(Extended Data Fig. 7e). In addition, B6 Tac mice (which have pre-existing
SFB) had reduced percentages of TH17 cells when fed 3-oxoLCA, compared
to when fed with vehicle (Extended Data Fig. 7f-h). Treatment with
3-oxoLCA did not affect percentages of Treg cells (Extended Data Fig. 7i).
Even under the gut inflammatory conditions induced by anti-CD3
injection (which are known to produce a robust TH17 cell response), mice
treated with 1% (but not with 0.3%) 3-oxoLCA had a reduced percentage
of TH17 cells (Extended Data Fig. 7j-l).
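
The approximate equivalence between pmol per mg of wet mass and micromolar
follows from unit arithmetic if the sample is assumed to have a density close
to that of water. The sketch below makes that assumption explicit; the
function name and the default density of 1.0 g/ml are illustrative choices,
not values stated in the text.

```python
def pmol_per_mg_to_micromolar(pmol_per_mg, density_g_per_ml=1.0):
    """Convert an amount per wet mass into an approximate molar concentration.

    1 pmol/mg equals 1 nmol/g. Multiplying by the density (g/ml) gives
    nmol/ml, which is the same as micromol/l, that is, micromolar.
    The default density of 1.0 g/ml is an assumption (water-like sample).
    """
    if pmol_per_mg < 0 or density_g_per_ml <= 0:
        raise ValueError("concentration must be >= 0 and density > 0")
    return pmol_per_mg * density_g_per_ml


# Caecal 3-oxoLCA level reported in the text: 24 pmol per mg wet mass.
print(pmol_per_mg_to_micromolar(24.0))
```

Under this assumption the reported caecal level corresponds to roughly the
20 μM range used in the in vitro cultures.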

To examine the effects of isoalloLCA on Treg cells in vivo, we fed
SFB-colonized B6 Tac mice a control diet or a diet containing 0.03%
(w/w) isoalloLCA. IsoalloLCA alone was insufficient to increase the
percentage of Treg cells either at steady state (Extended Data Fig. 7m)
or after treatment with anti-CD3 (Extended Data Fig. 7n). We noted
that 3-oxoLCA further enhanced the Treg cell differentiation induced
by isoalloLCA in vitro (Extended Data Fig. 7o, p). Consistent with this
observation, administration of a mixture of 0.3% (w/w) 3-oxoLCA and
0.03% (w/w) isoalloLCA significantly enhanced the Treg cell population
in mice treated with anti-CD3, compared to that of mice with a control
diet (Fig. 4c, d). Consistent with the mechanism in vitro, this treatment
with the 3-oxoLCA and isoalloLCA mixture led to increased mitoROS
production among CD4+ T cells in the ileal lamina propria (Extended
Data Fig. 7q). Importantly, the enhanced expression of FOXP3 induced
by the 3-oxoLCA and isoalloLCA mixture in vivo was also dependent on
the CNS3 enhancer, as shown by the fact that CNS3-knockout cells,
unlike wild-type cells, no longer responded to this treatment in a mixed
bone marrow experiment (Fig. 4e, f). Feeding both isoalloLCA and
3-oxoLCA in chow resulted in an average concentration of 47 pmol mg⁻¹
isoalloLCA in caecal contents (Extended Data Fig. 7b). This
concentration was sufficient to enhance Treg cell differentiation in vitro
(Fig. 1c). The mean concentration of isoalloLCA in the stool of patients
with ulcerative colitis was 2 pmol mg⁻¹, and ranged between 0 and
17 pmol mg⁻¹ (Extended Data Fig. 7c). These values are within an order of
magnitude of the concentrations observed in mice fed 0.03% isoalloLCA
and 0.3% 3-oxoLCA, suggesting that the in vivo levels of isoalloLCA
achieved are physiologically relevant.

Fig. 4 | 3-OxoLCA inhibits TH17 development and isoalloLCA enhances Treg
cells in vivo. a, b, Experimental scheme (a) and flow cytometric analysis (b) of
TH17 cell induction by SFB. B6 Jax mice were gavaged with SFB-rich faecal
pellets and kept on 3-oxoLCA (0.3%) for a week (n = 11 mice per group).
c, d, Experimental scheme (c) and flow cytometric analysis (d) of anti-CD3
experiment with a mixture of 3-oxoLCA + isoalloLCA (n = 10 or 9 mice for
control or 3-oxoLCA + isoalloLCA treatment, respectively). B6 Tac mice
were intraperitoneally injected with anti-CD3 and fed a control diet or a
mixture of 3-oxoLCA (0.3%) + isoalloLCA (0.03%) during the experiments.
e, f, Experimental scheme (e) and flow cytometric analysis (f) of T cells isolated
from the ileal lamina propria. Bone marrow cells from wild-type (CD45.1) and
CNS3-knockout (CD45.2) mice were mixed at a 1:1 ratio and transferred into
irradiated wild-type (CD45.1) recipient mice. Five weeks after the transfer,
recipient mice were fed a control diet or a diet containing a mixture of
3-oxoLCA (0.3%) + isoalloLCA (0.03%), followed by an anti-CD3 injection
(n = 10 mice per group). Data shown as mean ± s.d. by unpaired t-test with
two-tailed P value.

We next asked whether the immunomodulatory roles of 3-oxoLCA
and isoalloLCA are mediated through changes in the composition of the
gut bacterial community. 16S rDNA sequencing with faecal samples of
mice fed diets containing bile acids revealed no substantial
perturbations in the gut bacterial community, compared to mice fed a control
diet (Extended Data Fig. 8a-e). Furthermore, treatment with 3-oxoLCA
reduced TH17 cell induction in the colons of germ-free B6 mice infected
with Citrobacter rodentium (Extended Data Fig. 8f, g). Thus, the TH17 and
Treg cell modulatory activities of 3-oxoLCA and isoalloLCA probably do
not require the presence of a community of commensal bacteria. These
data suggest that both 3-oxoLCA and isoalloLCA directly modulate TH17
and Treg cell responses in mice in vivo.

Finally, we investigated whether the in vitro treatment of T cells with
isoalloLCA produced Treg cells competent to exert suppressive function
in vivo. The same number of FOXP3+ T cells (CD45.2), sorted from T cell
cultures with low or high TGFβ concentrations (TGFβlo or TGFβhi
Treg cells, respectively) in the absence or presence of isoalloLCA, were
adoptively transferred into RAG1-knockout mice that had also received
CD45RBhi naive CD4+ T cells (CD45.1) (Extended Data Fig. 9a, b). Mice
that received CD45RBhi cells alone, or CD45RBhi cells together with
TGFβlo Treg cells, developed substantial weight loss and shortened colon
phenotypes, both of which are indicators of symptoms associated with
colitis (Extended Data Fig. 9c-f). By contrast, the adoptive transfer of
isoalloLCA-treated Treg cells protected mice from developing symptoms
associated with colitis to the same degree as mice that received TGFβhi
Treg cells (Extended Data Fig. 9c-f). Treg cells treated with isoalloLCA
were more stable in terms of FOXP3 expression, as compared to TGFβlo
Treg cells treated with DMSO, when analysed eight weeks after transfer
(Extended Data Fig. 9g-j). In addition, mice that received
isoalloLCA-treated Treg cells had reduced numbers of CD45.1+ T effector cells
(Extended Data Fig. 9k). Therefore, isoalloLCA probably promotes the
stability of Treg cells and enhances their function after adoptive transfer
in vivo, leading to decreased proliferation of T effector cells.


Discussion 


Some bile acids are thought to be tissue-damaging agents that promote
inflammation, owing to their enhanced accumulation in patients with
liver diseases and their chemical properties as detergents that disrupt
cellular membranes. However, recent studies have begun to reveal the
anti-inflammatory roles of bile acids, particularly in the innate immune
system by suppressing NF-κB-dependent signalling pathways and by
inhibiting NLRP3-dependent inflammasome activities. Our studies reveal
additional anti-inflammatory roles for two LCA metabolites found in
both humans and rodents that directly affect CD4+ T cells: 3-oxoLCA
suppresses TH17 cell differentiation and isoalloLCA enhances Treg cell
differentiation. Our data suggest that both 3-oxoLCA and isoalloLCA
are present in the stool samples of patients with colitis as well as in the
caeca of conventionally-housed B6 Jax mice (Extended Data Fig. 7c, d).
Importantly, both bile acids are completely absent in germ-free B6 mice
(Extended Data Fig. 7d). These data suggest that gut-residing bacteria
may contribute to the production of 3-oxoLCA and isoalloLCA, although
we cannot rule out the possibility that host enzymes are involved. Given
the critical roles of TH17 and Treg cells in a wide variety of inflammatory
diseases and their close relationship with gut-residing bacteria, our
study suggests the existence of novel modulatory pathways that regulate
T cell function through bile acid metabolites. Future studies to elucidate
the bacteria or host enzymes that generate 3-oxoLCA and isoalloLCA
will provide the means for controlling T cell function in the context of
autoimmune diseases and other inflammatory conditions.


Methods


No statistical methods were used to predetermine sample size. The 
experiments were not randomized and investigators were not blinded 
to allocation during experiments and outcome assessment. 


Mice 

C57BL/6, FOXP3-GFP, FXR-knockout, VDR-knockout, CD45.1 and
RAG1-knockout mice were purchased from the Jackson Laboratory.
SFB-containing C57BL/6NTac mice were purchased from Taconic Biosciences.
FOXP3-CNS-knockout and control mice were provided by the Y. Zheng
laboratory. All mouse procedures were approved by the Institutional
Animal Care and Use Committee at Harvard Medical School.


Chemical synthesis of 3-oxoLCA, isoalloLCA, glyco-3-oxoLCA, 
and glyco-isoalloLCA 

Detailed synthesis methods and characterization data are included 
in Supplementary Information. 


Measurement of lipophilicity 

Partitioning method. Two microlitres of a 10 mM stock solution of
each target compound was added to 1 ml of 50 mM ammonium
bicarbonate (pH = 8) and 1 ml of n-octanol in an Eppendorf tube. The
resulting two-phase mixture was vortexed and then shaken for 18 h
at 20 °C. The two phases were then carefully separated into the
aqueous sample (bottom layer) and the organic sample (top layer), placed
separately in autosampler vials and analysed by liquid chromatography
with tandem mass spectrometry (LC-MS/MS).


LC-MS method. A Thermo q-Exactive Plus LC-MS equipped with an
Ultimate 3000 HPLC was operated in negative ion mode after being
optimized to detect the [M − H]⁻ ion of the bile acids. Mobile phase A was
5 mM ammonium acetate with 0.012% formic acid and mobile phase B
was HPLC-grade methanol. A Dikma Inspire C8 column (3-μm particle
size, 100-mm length, 4.6-mm inner diameter) was used for analysis.
Each injection was 5 μl, and a constant flow rate of 0.400 ml/min was
used. The gradient started at 0% B and was held constant for 2 min.
Then, the mobile phase composition was linearly changed to 100% B
over 8 min and held at 100% for the following 5 min. The mobile phase
composition was changed to 0% B over the following 0.1 min, and the
system was allowed to equilibrate to starting conditions over 1.9 min.
Standards of all targets were used to establish retention times.
Better-than-2-ppm mass accuracy was obtained on all measurements. The
LC-MS analysis was done in triplicate, and the partitioning was done
once for the [M − H]⁻ ion of each target.
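
The gradient described above can be written out as a piecewise-linear
timetable, which makes the total run time (2 + 8 + 5 + 0.1 + 1.9 = 17 min)
easy to check. This is a minimal sketch only; the segment table and the
function name percent_b are illustrative and are not part of the published
method.

```python
# Gradient segments as (start_min, end_min, start_percent_B, end_percent_B),
# transcribed from the method text.
GRADIENT = [
    (0.0, 2.0, 0.0, 0.0),        # hold at 0% B for 2 min
    (2.0, 10.0, 0.0, 100.0),     # linear ramp to 100% B over 8 min
    (10.0, 15.0, 100.0, 100.0),  # hold at 100% B for 5 min
    (15.0, 15.1, 100.0, 0.0),    # return to 0% B over 0.1 min
    (15.1, 17.0, 0.0, 0.0),      # re-equilibrate for 1.9 min
]


def percent_b(t_min):
    """Return the programmed percentage of mobile phase B at time t_min."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + fraction * (b_end - b_start)
    raise ValueError("time is outside the 0-17 min run")


for t in (0.0, 2.0, 6.0, 10.0, 15.0, 17.0):
    print(f"{t:5.1f} min -> {percent_b(t):5.1f}% B")
```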


Calculation of log D. The total response of target in octanol was
divided by the total response in the aqueous phase to get a partitioning
coefficient. Then, log10 was taken and reported.
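
As a worked illustration of this calculation, the sketch below takes the
base-10 logarithm of the octanol-to-aqueous response ratio. It assumes that
"total response" means the summed detector response across the triplicate
injections; that reading, the function names and the example numbers are all
assumptions made for illustration.

```python
import math


def log_d(octanol_response, aqueous_response):
    """Return log10 of the octanol/aqueous response ratio."""
    if octanol_response <= 0 or aqueous_response <= 0:
        raise ValueError("responses must be positive")
    return math.log10(octanol_response / aqueous_response)


def log_d_from_replicates(octanol_runs, aqueous_runs):
    """Sum replicate responses per phase (an assumption), then take log D."""
    return log_d(sum(octanol_runs), sum(aqueous_runs))


# Hypothetical peak areas for three injections of each phase.
octanol = [1.55e6, 1.60e6, 1.62e6]
aqueous = [1.00e4, 1.02e4, 0.98e4]
print(round(log_d_from_replicates(octanol, aqueous), 2))
```

A compound whose octanol response is 100 times its aqueous response has a
log D of 2 under this definition.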


Human faecal specimens 

Faecal samples were obtained from patients with active ulcerative 
colitis under an Institutional-Review-Board-approved protocol and 
informed consent was obtained at Weill Cornell Medicine. Active inflam- 
mation was defined by a Mayo endoscopic score of > 1. 


In vivo bile acid analysis 

Stock solutions of all bile acids were prepared by dissolving the
compounds in molecular-biology-grade DMSO (Sigma Aldrich). These
solutions were used to establish standard curves. Glycocholic acid or
β-muricholic acid (Sigma Aldrich) was used as the internal standard
for mouse and human samples, respectively. Bile acids were extracted
from mouse caecal and human faecal samples, and quantified by
ultra-high-performance liquid chromatography-mass spectrometry
(UPLC-MS) as previously reported. The limits of detection of individual bile
acids in tissues (in pmol/mg wet mass) are as follows: β-muricholic
acid, 0.10; isoalloLCA, 0.45; isoLCA, 0.29; LCA, 0.12; alloLCA, 0.43;
and 3-oxoLCA, 0.18.
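
One common way to use such limits of detection is to treat any measurement
below the limit as not detected. The sketch below encodes the limits listed
above; the handling rule (returning None below the limit) and the names used
are assumptions for illustration and are not described in the source.

```python
# Limits of detection in pmol per mg wet mass, as listed in the text.
LOD_PMOL_PER_MG = {
    "beta-muricholic acid": 0.10,
    "isoalloLCA": 0.45,
    "isoLCA": 0.29,
    "LCA": 0.12,
    "alloLCA": 0.43,
    "3-oxoLCA": 0.18,
}


def apply_lod(bile_acid, measured_pmol_per_mg):
    """Return the measurement, or None if it falls below the listed limit.

    Reporting sub-limit values as "not detected" (None) is an assumed
    convention, not a rule stated in the method.
    """
    limit = LOD_PMOL_PER_MG[bile_acid]
    if measured_pmol_per_mg < limit:
        return None
    return measured_pmol_per_mg


print(apply_lod("3-oxoLCA", 24.0))
print(apply_lod("isoalloLCA", 0.30))
```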


In vitro T cell culture

Naive CD4+ (CD62L+CD44−CD25−CD4+) T cells were isolated from the
spleens and the lymph nodes of mice of designated genotypes, using
FACS. For some experiments, naive CD4+ T cells were enriched using
naive CD4+ T cell isolation kits (Miltenyi). Naive CD4+ T cells (40,000
cells) were cultured in a 96-well plate pre-coated with hamster IgG (MP
Biomedicals) in T cell medium (RPMI, 10% fetal bovine serum, 25 mM
glutamine, 55 μM 2-mercaptoethanol, 100 U/ml penicillin, 100 mg/ml
streptomycin) supplemented with 0.25 μg/ml anti-CD3 (clone
145-2C11) and 1 μg/ml anti-CD28 (clone 37.51). For naive T cell (TH0) culture,
T cells were cultured with the addition of 100 U/ml of IL-2 (Peprotech).
For TH1 cell differentiation, T cells were cultured with the addition of
100 U/ml of IL-2, 10 μg/ml of anti-IL-4 (clone 11B11) and 10 ng/ml of IL-12
(Peprotech). For TH2 cell differentiation, T cells were cultured with the
addition of 10 μg/ml of anti-IFNγ (clone XMG1.2) and 10 ng/ml of IL-4
(R&D Systems). For TH17 cell differentiation, T cells were cultured with
the addition of 10 ng/ml of IL-6 (eBioscience) and 0.5 ng/ml of TGFβ
(Peprotech). For Treg cell culture, T cells were cultured with the addition
of 100 U/ml of IL-2 and various concentrations of TGFβ. For most in vitro
experiments to test the effects of isoalloLCA, no additional TGFβ was
added. Bile acids, retinoic acid (Sigma), mitoQ (Focus Biomolecules)
or mitoPQ (Sigma) were added either at 0-h or 16-h time points.
Compounds with low water solubility were sonicated before adding to the
culture. Cells were collected and assayed by flow cytometry on day 3. For
ROS and mitochondrial membrane potential detection, cells cultured
for 2 days were incubated with 5 μM of mitoSOX (ThermoFisher),
10 μM of DCFDA (Sigma) or 2 μM of JC-1 (ThermoFisher) for 30 min and
assayed with flow cytometry.
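
The polarizing conditions above can be summarized as a small lookup table,
which is convenient when annotating plate layouts. This is a sketch only: the
dictionary layout, the key spellings (for example "anti-IFNg" and "TGFb") and
the helper function are illustrative, and the Treg entry omits TGFβ because
its concentration varied between experiments.

```python
# Supplements per culture condition as (amount, unit), taken from the text.
# All conditions also receive anti-CD3 and anti-CD28 in T cell medium.
POLARIZING_CONDITIONS = {
    "TH0": {"IL-2": (100, "U/ml")},
    "TH1": {"IL-2": (100, "U/ml"),
            "anti-IL-4": (10, "ug/ml"),
            "IL-12": (10, "ng/ml")},
    "TH2": {"anti-IFNg": (10, "ug/ml"),
            "IL-4": (10, "ng/ml")},
    "TH17": {"IL-6": (10, "ng/ml"),
             "TGFb": (0.5, "ng/ml")},
    "Treg": {"IL-2": (100, "U/ml")},  # plus TGFb at varying concentrations
}


def describe(subset):
    """Return a one-line summary of the supplements for a T cell subset."""
    parts = [f"{name} {amount:g} {unit}"
             for name, (amount, unit) in POLARIZING_CONDITIONS[subset].items()]
    return f"{subset}: " + "; ".join(parts)


for subset in POLARIZING_CONDITIONS:
    print(describe(subset))
```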


Flow cytometry 

Cells collected from in vitro culture or in vivo mice experiments were 
stimulated with 50 ng/ml phorbol 12-myristate 13-acetate (PMA) 
(Sigma) and 1 μM ionomycin (Sigma) in the presence of GolgiPlug (BD)
for 4 h to determine cytokine expression. After stimulation, cells were
stained with cell-surface marker antibodies and LIVE/DEAD Fixable dye, 
Aqua, to exclude dead cells, fixed and permeabilized with a FOXP3/ 
transcription factor staining kit (eBioscience), followed by staining 
with cytokine- and/or transcription-factor-specific antibodies. All flow 
cytometry analyses were performed on an LSRII flow cytometer (BD) 
and data were analysed with FlowJo software (TreeStar). 


Cell proliferation assay 

Naive CD4+ T cells were labelled with 1 μM carboxyfluorescein
succinimidyl ester (CFSE, BioLegend) and cultured for three days before
FACS analysis.
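
Because each cell division roughly halves the CFSE signal per cell, the dilution read out by FACS can be converted into an approximate number of divisions. The sketch below illustrates only that arithmetic; the function name and the example intensities are hypothetical and are not taken from the study.

```python
import math


def divisions_from_cfse(mfi_undivided: float, mfi_sample: float) -> float:
    """Estimate the number of cell divisions from CFSE dilution.

    Assumes an ideal two-fold loss of CFSE fluorescence per division.
    Both arguments are (hypothetical) mean fluorescence intensities.
    """
    if mfi_undivided <= 0 or mfi_sample <= 0:
        raise ValueError("fluorescence intensities must be positive")
    return math.log2(mfi_undivided / mfi_sample)


# Illustrative values only: an eight-fold drop corresponds to three divisions.
print(divisions_from_cfse(8000.0, 1000.0))
```

In practice, division peaks are usually gated directly on the CFSE histogram; the formula is a quick sanity check rather than a replacement for that gating.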


In vitro suppression assay 

A total of 2.5 × 10⁴ freshly purified naive CD4+CD25−CD44−CD62Lhigh
T (Tconv) cells from CD45.1 B6 mice were labelled with 1 μM CFSE,
activated with soluble anti-CD3 (1 μg/ml) and 5 × 10⁴ APCs in 96-well
round-bottom plates for 3 days in the presence of tester cells (CD45.2). The
CFSE dilution of CD45.1 Tconv cells was assessed by flow cytometry.


Mammalian luciferase reporter assay 

Reporter assays were conducted as previously described. In brief,
50,000 HEK 293 cells per well were plated in 96-well plates in
antibiotic-free Dulbecco’s Modified Eagle medium (DMEM) containing
1% fetal calf serum (FCS). Cells were transfected with a DNA mixture
containing 0.5 μg/ml of firefly luciferase reporter plasmid (Promega
pGL4.31 (luc2P/Gal4UAS/Hygro)), 2.5 ng/ml of a plasmid containing
Renilla luciferase (Promega pRL-CMV), and GAL4-DNA binding
domain-RORγ (0.2 μg/ml). Transfections were performed using
TransIT-293 (Mirus) according to the manufacturer’s instructions.
Bile acids or vehicle control were added 24 h after transfection and
luciferase activity was measured 16 h later using the dual-luciferase
reporter kit (Promega).
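
A dual-luciferase readout is commonly summarized by dividing the firefly signal by the Renilla signal in each well and then expressing treated wells relative to the vehicle control. The following is a minimal sketch of that normalization, assuming this conventional scheme; the text does not state the exact calculation used, and the function name and readings are hypothetical.

```python
def relative_reporter_activity(firefly: float, renilla: float,
                               vehicle_firefly: float,
                               vehicle_renilla: float) -> float:
    """Firefly/Renilla ratio of a treated well, relative to the vehicle well."""
    if renilla <= 0 or vehicle_renilla <= 0 or vehicle_firefly <= 0:
        raise ValueError("luminescence readings must be positive")
    treated_ratio = firefly / renilla              # corrects for transfection efficiency
    vehicle_ratio = vehicle_firefly / vehicle_renilla
    return treated_ratio / vehicle_ratio


# Illustrative readings only: the treated well retains half of the vehicle activity.
print(relative_reporter_activity(500.0, 100.0, 1000.0, 100.0))
```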


Microscale thermophoresis assay 

The binding affinity of the compounds with RORγ ligand-binding
domain was analysed by microscale thermophoresis (MST). Purified
RORγ ligand-binding domain was labelled with the Monolith NT Protein
Labelling Kit RED (NanoTemper Technologies). Serially diluted
compounds, with concentrations of 1 mM to 20 nM, were mixed with 55 nM
labelled RORγ ligand-binding domain at room temperature and loaded
into Monolith standard-treated capillaries. Binding was measured by
monitoring the thermophoresis with 20% LED power and ‘medium’ MST
power on a Monolith NT.115 instrument (NanoTemper Technologies)
with the following time setting: 5 s Fluo, before; 20 s MST on; and 5 s
Fluo, after. Kd values were fitted using the NT Analysis software
(NanoTemper Technologies).
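
For orientation, a Kd fit of this kind typically rests on a 1:1 binding model in which the labelled protein is held at a fixed concentration and the ligand is titrated. The sketch below computes the fraction of labelled protein bound under that model; it is an illustration under stated assumptions, not the NT Analysis implementation. The function name and the example Kd of 1,000 nM are hypothetical; only the 55 nM protein concentration and the 20 nM to 1 mM ligand range come from the text.

```python
import math


def fraction_bound(ligand_nM: float, kd_nM: float, target_nM: float = 55.0) -> float:
    """Fraction of labelled target bound in a 1:1 model with ligand depletion.

    Solves [TL]^2 - (T + L + Kd)[TL] + T*L = 0 for the complex concentration
    and takes the physically meaningful (smaller) root.
    """
    total = target_nM + ligand_nM + kd_nM
    complex_nM = (total - math.sqrt(total * total - 4.0 * target_nM * ligand_nM)) / 2.0
    return complex_nM / target_nM


# Hypothetical Kd of 1,000 nM evaluated across the titration range in the text.
for ligand in (20.0, 1_000.0, 1_000_000.0):
    print(ligand, round(fraction_bound(ligand, kd_nM=1_000.0), 4))
```

Fitting Kd then amounts to finding the value that best maps the measured thermophoresis signal onto this curve between its unbound and bound plateaus.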


qPCR with reverse transcription 

Total RNA was isolated from cultured T cells using an RNeasy kit (Qiagen) 
and reverse-transcribed using a PrimeScript RT kit (Takara). All qPCRs 
were run on the Bio-Rad CFX real-time system using iTaq Universal SYBR
Green Supermix (Bio-Rad). β-Actin was used as an internal control to
normalize the data across different samples. Primers used for qPCR
were as follows: Foxp3 forward (-F), 5’-ACTGGGGTCTTCTCCCTCAA-3’;
Foxp3 reverse (-R), 5’-CGTGGGAAGGTGCAGAGTAG-3’;
Actb-F, 5’-CGCCACCAGTTCGCCATGGA-3’; Actb-R, 5’-TACAGCCCGGGGAGCATCGT-3’.
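
Normalization to an internal control such as β-actin is often computed with the comparative Ct (2^−ΔΔCt) method. The sketch below shows that calculation as an assumption about how such data are usually processed; the text states only that β-actin was the internal control. The function name and Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_control: float,
                        ct_reference_control: float) -> float:
    """Fold change of a target gene by the comparative Ct (2^-ddCt) method."""
    delta_ct_sample = ct_target - ct_reference                   # e.g. Foxp3 minus Actb
    delta_ct_control = ct_target_control - ct_reference_control  # same, control sample
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


# Illustrative Ct values only: the target crosses threshold two cycles
# earlier (relative to Actb) in the treated sample, i.e. a four-fold increase.
print(relative_expression(24.0, 18.0, 26.0, 18.0))
```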


Metabolic assays 

In vitro differentiated cells were cultured in the presence of DMSO 
or isoalloLCA for 48 h, and washed extensively before the assay. The 
oxygen consumption rate was determined using a Seahorse XF96 
Extracellular Flux Analyzer (Seahorse Bioscience) following protocols 
recommended by the manufacturer and according to the previously 
published method. In brief, cells were seeded on XF96 microplates
(150,000 cells per well) that had been pre-coated with poly-D-lysine
(Sigma) to immobilize cells. Cells were maintained in XF medium in a
non-CO2 incubator for 30 min before the assay. The Mito stress test kit
(Agilent) was used to test the oxygen consumption rate by sequential
injection of 1 μM oligomycin, 1.5 μM FCCP (carbonyl cyanide
4-(trifluoromethoxy)phenylhydrazone) and 0.5 μM rotenone or antimycin
A. Data were analysed by Wave software (Agilent).
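
The sequential injections allow the oxygen consumption rate (OCR) trace to be split into standard respiration parameters. The sketch below follows a commonly used convention for the Mito stress test and is given as an assumption, not as the calculation performed in this study; the function name and the OCR values are hypothetical.

```python
def mito_stress_parameters(basal, post_oligomycin, post_fccp, post_rot_aa):
    """Derive respiration parameters from OCR readings (pmol O2/min).

    Each argument is the list of readings taken in one phase of the assay:
    before injection, after oligomycin, after FCCP, after rotenone/antimycin A.
    """
    non_mito = min(post_rot_aa)               # residual, non-mitochondrial OCR
    basal_resp = basal[-1] - non_mito         # last reading before first injection
    proton_leak = min(post_oligomycin) - non_mito
    atp_linked = basal[-1] - min(post_oligomycin)
    maximal = max(post_fccp) - non_mito       # uncoupled (FCCP-driven) respiration
    return {
        "non_mitochondrial": non_mito,
        "basal": basal_resp,
        "proton_leak": proton_leak,
        "atp_linked": atp_linked,
        "maximal": maximal,
        "spare_capacity": maximal - basal_resp,
    }


# Illustrative readings only.
print(mito_stress_parameters(
    basal=[100.0, 102.0, 101.0],
    post_oligomycin=[40.0, 38.0, 39.0],
    post_fccp=[180.0, 200.0, 190.0],
    post_rot_aa=[20.0, 18.0, 19.0],
))
```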


Chromatin immunoprecipitations 

Chromatin immunoprecipitation (ChIP) assays were performed 
according to a standard protocol. In brief, naive CD4+ T cells were
cultured for 48 h, and fixed for 10 min with 1% formaldehyde. Then,
0.125 M glycine was added to quench the formaldehyde. Cells were
lysed, and chromatin was collected and fragmented by sonication
at a concentration of 10⁷ cells per ChIP sample. Chromatin was
immunoprecipitated with 5 μg of ChIP or IgG control antibodies at
4 °C overnight and incubated with protein G magnetic beads
(ThermoFisher) at 4 °C for 2 h, washed and eluted in 150 μl elution buffer.
Eluate DNA and input DNA were incubated at 65 °C to reverse the
crosslinking. After digestion with proteinase K, DNA was purified
with the QIAquick PCR purification kit (Qiagen). The relative
abundance of precipitated DNA fragments was analysed by qPCR using
SYBR Green Supermix (Bio-Rad). The primers used were as follows:
Foxp3 promoter-F, 5’-TAATGTGGCAGTTTCCCACAAGCC-3’;
Foxp3 promoter-R, 5’-AATACCTCTCTGCCACTTTCGCCA-3’;
Foxp3 CNS1-F, 5’-AGACTGTCTGGAACAACCTAGCCT-3’;
Foxp3 CNS1-R, 5’-TGGAGGTACAGAGAGGTTAAGAGCCT-3’;
Foxp3 CNS2-F, 5’-ATCTGGCCAAGTTCAGGTTGTGAC-3’;
Foxp3 CNS2-R, 5’-GGGCGTTCCTGTTTGACTGTTTCT-3’;
Foxp3 CNS3-F, 5’-TCTCCAGGCTTCAGAGATTCAAGG-3’;
Foxp3 CNS3-R, 5’-ACAGTGGGATGAGGATACATGGCT-3’;
Foxp3 ex10-F, 5’-CTGCATCGTAGCCACCAGTA-3’;
Foxp3 ex10-R, 5’-AACTATTGCCATGGCTTCC-3’;
Hsp90ab1-F, 5’-TTACCTTGACGGGAAAGCCGAGTA-3’;
Hsp90ab1-R, 5’-TTCGGGAGCTCTCTTGAGTCACC-3’.


Isolation of lamina propria lymphocytes

Gut tissues were collected and treated with 1 mM DTT at room
temperature for 10 min, and 5 mM EDTA at 37 °C for 20 min to remove
epithelial cells, and dissociated in digestion buffer (RPMI, 1 mg/ml
collagenase type VIII, 100 μg/ml DNase I and 5% FBS) with constant
stirring at 37 °C for 30 min. Mononuclear cells were collected at the
interface of a 40%–80% Percoll gradient (GE Healthcare). Cells were then
analysed by flow cytometry. The distal one-third of the small intestines
was considered the ileum. 


Mouse experiments 

For bile-acid feeding experiments, the standard mouse diet in ground
meal format (PicoLab Diet, no. 5053) was evenly mixed with a measured
amount of bile acid compounds and provided in glass feeder
jars and replenished when necessary. Bile-acid feeding experiments
were performed for seven days or less as extended feeding sometimes
resulted in reduced food consumption and weight loss. Colonization
of mice with SFB was done with fresh faecal samples, derived from
Il23r−/−Rag2−/− double-knockout mice that are known to carry much
higher levels of SFB compared to conventional B6 mice. Faecal samples
were homogenized in water using a 70-μm cell strainer and a 5-ml
syringe plunger. Supernatant was introduced into mice using a 20G
gavage needle at 250 μl per mouse, approximately equal to the amount
of 1/4 mouse faecal pellets. Successful colonization was assessed by
qPCR, using the following primers: SFB-F, 5’-GACGCTGAGGCATGAGAGCAT-3’;
SFB-R, 5’-GACGGCACGAATTGTTATTCA-3’; universal 16S-F,
5’-ACTCCTACGGGAGGCAGCAGT-3’; universal 16S-R,
5’-ATTACCGCGGCTGCTGGC-3’. For the C. rodentium infection experiment, age- and
sex-matched germ-free mice were orally infected with approximately
1 × 10⁹ colony-forming units of C. rodentium and killed for analysis at
6 days after infection. Mice were kept in an IsoCage system (Tecniplast)
and fed an autoclaved diet with or without 0.3% 3-oxoLCA (w/w)
during the experiment.


Bone marrow transfer 

Bone marrow cells were isolated from the femur and tibia of B6 
(CD45.1) mice or of CNS3-knockout mice (CD45.2). Red blood cells 
were removed by using an ammonium-chloride-potassium lysing 
buffer. The two populations were mixed at a 1:1 ratio and a total of
1 × 10⁷ cells were transferred into each irradiated (1,000 rad) CD45.1
mouse (5 weeks old) by retro-orbital injection. Sulfamethoxazole- 
trimethoprim (240 mg in 250 ml drinking water) was provided for 2 
weeks after irradiation. 


Adoptive transfer colitis 

CD45RBhigh adoptive transfer colitis was performed as previously
described. In brief, isolated CD4+CD25−CD45RBhigh naive T cells
were sorted from wild-type B6 (CD45.1) mice by FACS and 0.5 million
cells were adoptively transferred into each RAG1-knockout recipient
mouse. In addition, the same number of in vitro cultured and
sort-purified CD45.2+ FOXP3-GFP+ cells was transferred into the recipient
mice. Naive CD4 T cells, isolated from CD45.2 FOXP3-IRES-GFP mice,
were cultured under TGFβlo (0.05 ng/ml TGFβ), isoalloLCA (20 μM
isoalloLCA and 0.01 ng/ml TGFβ) or TGFβhigh (1 ng/ml TGFβ) conditions.
Mice were then monitored and weighed each week. At week 8,
colon tissues were collected, and lamina propria lymphocytes were 
analysed by flow cytometry. Haematoxylin and eosin (H & E) staining 
and disease scoring were performed by the Rodent Histopathology 
Core at Harvard Medical School. 


Isolation of faecal bacterial microbiota and 16S rRNA gene 
sequencing analysis 

DNA of the mouse faecal microbiota was isolated by using the QIAamp
Fast DNA Stool Mini Kit (Qiagen) according to the manufacturer’s
instructions. The samples were quantified using an Agilent 4200
Tapestation instrument, with corresponding Agilent Genomic DNA
ScreenTape assays. The samples were then normalized to 12.5 ng of
input in 2.5 μl (5 ng/μl), and amplified using IDT primers specific to the
V3 and V4 region: forward
5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3’, reverse
5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3’. The
amplification was done using the KAPA HiFi HotStart Ready Mix (2×)
(Roche Sequencing Solutions). Residual primers were removed using
Aline PCRClean DX beads in a 0.8× SPRI-based cleanup. The purified
amplicons were then ligated with indexing adapters using Illumina’s
Nextera XT Index Primers. Following this step, a final clean-up was
performed using Aline PCRClean DX beads. The resulting purified
libraries were run on an Agilent 4200 Tapestation instrument, with a
corresponding Agilent High Sensitivity D1000 ScreenTape assay to
visualize the libraries and check that the size of the library matched the
expected product of about 630 bp. Concentrations obtained from this
assay were used to normalize all samples in equimolar ratio. The pool
was denatured and loaded onto an Illumina MiSeq instrument, with an
Illumina MiSeq V3 600-cycle kit to obtain paired-end 300-bp reads. The
pool was loaded at 10.5 pM, with 50% PhiX spiked in to compensate for
low base diversity. The basecall files were demultiplexed through the
BioPolymer Facility's pipeline, and the resulting FASTQ files were used
in subsequent analysis. Raw FASTQ sequences were then quality-filtered
and analysed using QIIME2 version 2018.11 and DADA2 1.6.0.
Operational taxonomic units were picked with 97% sequence similarity.
The phylogenetic affiliation of each operational taxonomic unit was
assigned by alignment to the Greengenes reference database version 13_8 at 99% ID.
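
Equimolar pooling requires converting each library's mass concentration into a molar concentration using the fragment length. The sketch below shows that conversion and the resulting pooling volumes, assuming the usual approximation of 660 g/mol per base pair of double-stranded DNA; the function names, the 20 fmol target and the example concentrations are hypothetical, while the roughly 630-bp product size comes from the text.

```python
AVERAGE_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def library_molarity_nM(conc_ng_per_ul: float, fragment_bp: float) -> float:
    """Convert a dsDNA concentration in ng/ul to nM for a given fragment length."""
    return conc_ng_per_ul / (AVERAGE_BP_MASS * fragment_bp) * 1e6


def equimolar_volumes_ul(molarities_nM, fmol_each: float = 20.0):
    """Volume of each library (ul) that contributes the same molar amount.

    1 nM equals 1 fmol/ul, so volume = fmol wanted / molarity.
    """
    return [fmol_each / molarity for molarity in molarities_nM]


# Illustrative values only: a 630-bp library at 4.158 ng/ul is about 10 nM.
print(round(library_molarity_nM(4.158, 630), 3))
print(equimolar_volumes_ul([10.0, 5.0]))
```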


Statistical analyses 
Statistical analysis tests were performed with Prism v.8.0.2 (GraphPad). 
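
The figure legends report unpaired t-tests with two-tailed P values. As a rough illustration of the test statistic behind such a comparison, the sketch below computes Student's t and its degrees of freedom for two independent groups, assuming equal variances; whether that assumption matches the Prism settings used here is not stated in the text. The function name and data are hypothetical, and the P value itself (which requires the t distribution) is not computed.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for an equal-variance unpaired t-test."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    dof = n_a + n_b - 2
    # Pooled sample variance across both groups.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    standard_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, dof


# Illustrative data only.
t_value, dof = unpaired_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(round(t_value, 4), dof)
```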


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The 16S rDNA datasets are available through NCBI under accession
number PRJNA528994. Source Data for Figs. 1–4 and Extended Data
Figs. 2–9 are provided with the paper. Any other relevant data are
available from the corresponding authors upon reasonable request.

Extended Data Fig. 1 | Chemical structures of bile acid derivatives. These derivatives were used for the T cell differentiation assay.


[Figure panels. a, Gating for in vitro culture (FSC-W; live cells). Culture scheme: anti-CD3/28 with IL-6 + TGFβ or IL-2 + TGFβ; compounds added at 16 h; IL-17a or FOXP3 assayed by FACS on day 3.]

Extended Data Fig. 2 | 3-OxoLCA and isoalloLCA affect TH17 and Treg cell
differentiation. a, b, Gating strategy for the flow cytometric analyses of
B6 Jax mice (n = 2 biologically independent samples) were cultured under TH17



(IL-6 = 10 ng ml⁻¹; TGFβ = 0.5 ng ml⁻¹) (d) and Treg (IL-2 = 100 U ml⁻¹;
TGFβ = 0.1 ng ml⁻¹) (e) cell polarization conditions for 3 days. DMSO or various
bile acids at 20 μM concentration were added to the cell cultures on day 1. Data
are mean.

Extended Data Fig. 3 | IsoalloLCA-induced Treg cell expansion requires TGFβ.
a–c, Flow cytometry and histogram of CD4+ T cells, cultured for 3 days with
different amounts of TGFβ (1, 0.1, 0.01 or 0 ng ml⁻¹) and IL-2 (100 U ml⁻¹) in the
presence of DMSO or isoalloLCA (20 μM) and intracellularly stained for FOXP3
(n = 3 biologically independent samples per group). d, e, Flow cytometry of
CD4+ T cells, cultured for 3 days in the presence of DMSO, isoalloLCA (20 μM) or
TGFβ (0.05 ng ml⁻¹). In addition, anti-TGFβ antibody (10 μg ml⁻¹, 1D11) or isotype
group). f–h, 3-OxoLCA and isoalloLCA do not affect key transcription factor
expression. T cells were cultured under TH0, TH1, TH2 or TH17 conditions, in the
presence of DMSO, 3-oxoLCA (20 μM) or isoalloLCA (20 μM).
T-cell-lineage-determining transcription factors such as T-bet, GATA3 or RORγt were
intracellularly stained (n = 3 biologically independent samples per group). MFI,
mean fluorescence intensity. Data are mean ± s.d., by unpaired t-test with two-

Extended Data Fig. 4 | Effects of isoalloLCA on FOXP3 expression require
strong TCR stimulation. a, 3-OxoLCA and isoalloLCA demonstrate
dose-dependent effects on TH17 cell and Treg cell differentiation, respectively
(n = 2 biologically independent samples). A low concentration of TGFβ
(0.01 ng ml⁻¹) was used for Treg cell culture. b–d, 3-OxoLCA and isoalloLCA do
not significantly affect cell proliferation, cell viability or T cell activation.
b, Naive CD4+ T cells were labelled with the cell proliferation dye CFSE and
cultured for 3 days in the presence of DMSO, 3-oxoLCA or isoalloLCA under
TH17 or Treg cell polarization conditions. c, Live-cell percentages at the end of
the 3-day culture were determined based on both annexin V and fixable live/dead
staining (n = 3 biologically independent samples per group). d, Both
DMSO and isoalloLCA treatment lead to comparable levels of expression of
CD25, CD69, NUR77 and CD44. Naive CD4+ T cells were used as a negative
control. e, f, T cells were cultured with different concentrations of anti-CD3
antibody, in the presence of DMSO or isoalloLCA (20 μM). Representative FACS
plots of CD4+ T cells cultured for 3 days and stained intracellularly for FOXP3
(e). Quantification of FOXP3+ and viable T cells after 3-day culture (f)
(n = 2 biologically independent samples per group). Data are representative of
two independent experiments (b, d). Data in c are mean ± s.d.

Extended Data Fig. 5 | REL, VDR and FXR are dispensable for
isoalloLCA-dependent induction of FOXP3. a, b, In vitro suppression assay. CD4+ effector
T cells (Tconv) were labelled with CFSE and mixed with DMSO- or
isoalloLCA-treated Treg cells (tester) at different ratios (n = 2 biologically independent
samples per group). c, Expression of GFP in DMSO- or isoalloLCA-treated T cells
cultured with anti-CD3/28, IL-2 and TGFβ (0.01 ng ml⁻¹). Naive CD4+ T cells were
isolated from FOXP3-IRES-GFP mice. d, Flow cytometry of CD4+ T cells stained
intracellularly for FOXP3. Naive CD4+ T cells isolated from wild-type, CNS1-,
CNS2- or CNS3-knockout mice (n = 3 biologically independent samples per
group) were cultured with anti-CD3/28 and IL-2, LCA (20 μM), TGFβ
(0.05 ng ml⁻¹) and additional retinoic acid (1 ng ml⁻¹). e, f, Flow cytometry (e) and
its quantification (f) of CD4+ T cells stained intracellularly for FOXP3. Naive
CD4+ T cells were isolated from wild-type control mice or REL-knockout mice
(n = 4 biologically independent samples per group) and cultured with
anti-CD3/28 and IL-2 in the presence of DMSO, isoalloLCA (20 μM) or LCA (20 μM).
g, h, Naive CD4+ T cells isolated from wild-type control, VDR-knockout or
FXR-knockout mice (n = 2 biologically independent samples per group) were cultured
with anti-CD3/28 and IL-2 (g) or anti-CD3/28, IL-6 and TGFβ (h) for 3 days in the
presence of DMSO, isoalloLCA (20 μM), or 3-oxoLCA (20 μM). Representative
FACS plots of T cells intracellularly stained for FOXP3 or IL-17a. i, Chemical
structures of glycine-conjugated 3-oxoLCA (glyco-3-oxoLCA) and isoalloLCA
(glyco-isoalloLCA). j, k, Quantifications of TH17 (j) and Treg (k) cell
differentiation in vitro. T cells were cultured with anti-CD3/28, IL-6 and TGFβ (j)
or anti-CD3/28 and IL-2 (k) in the presence of DMSO, 3-oxoLCA (20 μM),
glyco-3-oxoLCA (20 μM), isoalloLCA (5 or 20 μM) or glyco-isoalloLCA (5, 10 or 20 μM).
Glyco-isoalloLCA exhibited enhanced cytotoxicity at 10 or 20 μM compared to
isoalloLCA (n = 3 biologically independent samples per group). Data are
representative of two independent experiments (c, d). Data are mean ± s.d., by
unpaired t-test with two-tailed P value.

Extended Data Fig. 6 | IsoalloLCA-dependent FOXP3 transcription requires
mitoROS and H3K27ac. a–c, ChIP analysis of H3K27ac, p300 and H3K4
mono-methylation (H3K4me1) on the Foxp3 gene locus. Chromatin obtained from
DMSO- and isoalloLCA-treated wild-type cells was immunoprecipitated with
IgG, anti-H3K27ac, anti-p300 or anti-H3K4me1 antibodies, followed by
real-time PCR analysis (n = 3 biologically independent samples per group). Primers
targeting Foxp3 promoter (Pro), CNS1, CNS2 and CNS3 region and Hsp90ab1
promoter were used for qPCR quantification. Relative enrichment was
calculated as fold change relative to the ChIP signal at the Foxp3 promoter of
the DMSO-treated control. d, e, Flow cytometry and quantification of CD4+
T cells stained intracellularly for FOXP3. Naive CD4+ T cells isolated from
wild-type mice (n = 2 biologically independent samples per group) were cultured
with anti-CD3/28, IL-2 and TGFβ (0.05 ng ml⁻¹) in the presence of DMSO or
isoalloLCA (20 μM) in the presence or absence of iBET. f, ChIP analysis of
H3K27ac on the Foxp3 promoter region. Naive CD4+ T cells isolated from
wild-type or CNS3-knockout mice (n = 3 biologically independent samples per
group) were treated with DMSO or isoalloLCA (20 μM). g, Seahorse analysis of
oxygen consumption rate (OCR) with naive CD4+ T cells isolated from wild-type
or CNS3-knockout mice cultured with anti-CD3/28 and IL-2 for 48 h, in the
presence of DMSO or isoalloLCA (20 μM). Measurements from six wells from
two mice for each genotype. h–k, T cells were cultured with DMSO, LCA,
isoLCA, alloLCA, isoalloLCA or 3-oxoLCA at 20 μM for 48 h. Their mitochondrial
and cytoplasmic ROS were measured by mitoSOX (h) and
2’,7’-dichlorofluorescein diacetate (DCFDA) (i), respectively. Total
mitochondrial mass was measured by MitoTracker (j) and the mitochondrial
membrane potential was measured by JC-1 dye (k). Mean fluorescence intensities of
different treatments were normalized as fold changes of those of the DMSO
control (n = 3 biologically independent samples per group). l, MitoROS
production measured by mitoSOX with T cells cultured with DMSO, isoalloLCA
(20 μM), retinoic acid (1 nM), or isoalloLCA (20 μM) + mitoQ (0.5 μM) for 48 h.
m, ChIP analysis (n = 3 biologically independent samples per group) of H3K27ac
on the Foxp3 promoter of T cells, treated with DMSO, isoalloLCA, isoalloLCA +
mitoQ or isoalloLCA + anti-TGFβ for 72 h. n–q, MitoROS production measured
by mitoSOX with T cells cultured with different concentrations of anti-CD3 and
treated with DMSO, isoalloLCA (20 μM), TGFβ (0.05 ng ml⁻¹) or isoalloLCA plus
TGFβ (n = 2 biologically independent samples per group) (n); or with T cells
treated with DMSO or isoalloLCA (20 μM) plus an isotype control or anti-TGFβ
antibody (n = 4 biologically independent samples per group) (o); or with T cells
cultured under TH1, TH2, TH17 or Treg cell conditions (n = 3 biologically
independent samples per group) (p); or with naive CD4+ T cells isolated from
wild-type or CNS3-knockout mice and cultured with anti-CD3/28 and IL-2 (n = 3
biologically independent samples per group) (q). r, MitoROS production
measured by mitoSOX with T cells cultured with DMSO or mitoPQ (5 μM) for
48 h. s, Dose-dependent effects of mitoPQ on Treg cell differentiation (n = 3
biologically independent samples per group). t, Quantification of Treg cell
differentiation in vitro on naive CD4+ T cells cultured in the presence of DMSO
or mitoPQ (5 μM) and treated with isotype control or anti-TGFβ antibody
(n = 3 biologically independent samples per group). u, A model showing the
mechanism of isoalloLCA enhancement of Treg cell differentiation. Data are
representative of two independent experiments (l, r) and shown as mean ± s.d.,
by unpaired t-test with two-tailed P value.

Extended Data Fig. 7 | 3-OxoLCA inhibits the differentiation of TH17 cells but
not Treg cells, and isoalloLCA alone does not enhance Treg cell differentiation
in vivo. a, UPLC-MS spectra of LCA and its isomers isoalloLCA, alloLCA, and
isoLCA, as well as 3-oxoLCA. b, Quantification of unconjugated LCA and its
derivatives in the caecal contents of B6 Tac mice fed on a control or
bile-acid-containing diet (n = 7, 5 and 4 mice for control (ctrl), 3-oxoLCA and
3-oxoLCA + isoalloLCA, respectively). c, Quantification of unconjugated
3-oxoLCA and isoalloLCA in human stool samples from patients with ulcerative
colitis (n = 16 donors). d, Quantification of unconjugated 3-oxoLCA, isoalloLCA
and LCA in mouse caecal contents from germ-free (GF) or conventionally
housed (CNV) mice (n = 15 mice per group). e, B6 Jax mice gavaged with SFB. SFB
colonization measured by qPCR analysis calculated as copy number (n = 5 mice
per group). f, Diagram showing experimental design. B6 Tac mice were fed a
3-oxoLCA (0.3%)-containing diet for 7 days. g, SFB colonization measured by
qPCR analysis calculated as SFB copy number (n = 5 mice per group). h, i, Flow
cytometric analysis and quantification of TH17 (h) and Treg (i) cells of the ileal
lamina propria (n = 7 mice per group). j–l, Experimental scheme of anti-CD3
experiment with 3-oxoLCA (j). Flow cytometric analysis and quantification of
CD4+ cells of the lamina propria following an anti-CD3 injection from B6 Tac
mice fed with control or 3-oxoLCA (0.3%) diet (n = 9 mice per group) (k), or
3-oxoLCA (1%) diet (n = 7 mice per group) (l). m, n, Flow cytometric analysis and
quantification of CD4+ cells of the ileal lamina propria in steady-state (m)
(n = 6 mice per group) or following an anti-CD3 injection (n) (n = 5 mice per
group). B6 Tac mice were fed with control or isoalloLCA (0.03%) diet. o, p, Flow
cytometry (o) and quantification (p) of CD4+ T cells stained intracellularly for
FOXP3, showing that the combination of 3-oxoLCA and isoalloLCA further
increases Treg cell differentiation. Naive CD4+ T cells isolated from wild-type B6
mice (n = 3 biologically independent samples) treated with DMSO, isoalloLCA
(20 μM), a mixture of 3-oxoLCA (20 μM) and isoalloLCA (20 μM) or a mixture of
3-oxoCA (20 μM) and isoalloLCA (20 μM) and cultured with anti-CD3/28 and
IL-2, with or without the addition of IL-6 (62.5 pg ml⁻¹). q, MitoROS production in
total CD4+ T cells isolated from the ileal lamina propria. Mice were fed a control
diet or diet containing a mixture of 3-oxoLCA (0.3%) + isoalloLCA (0.03%) (n = 9
or 10 mice, respectively) and injected with 10 μg of anti-CD3 to induce
inflammation. Data are mean ± s.d., by unpaired t-test with two-tailed P value.

Extended Data Fig. 8 | 3-OxoLCA or isoalloLCA does not significantly alter
gut microbiota. a, Box plot showing operational taxonomic unit (OTU)
numbers. b, Shannon diversity of faecal microbiota based on 16S rRNA gene
amplicon sequencing. For the box plots in a, b, the three horizontal lines of the
box represent the third quartile, median and first quartile, respectively, from
top to bottom. The whiskers above and below the box show the maximum and
minimum. c, Principal coordinates analysis based on weighted UniFrac
distances of 16S rRNA amplicon sequencing of faecal microbiota. d, e, Average
relative abundance of microbiota at the phylum (d) and the family (e) levels by
taxon-based analyses (n = 4, 5 and 5 mice for the control, 3-oxoLCA and
isoalloLCA groups, respectively). f, g, Experimental scheme (f) and flow
cytometric analysis and quantification (g) of CD4+ cells of the lamina propria of
the colon in germ-free B6 mice, infected with C. rodentium. Mice were fed an
autoclaved diet with or without 3-oxoLCA (0.3%) (n = 9 mice per group). Data
are mean ± s.d., by unpaired t-test with two-tailed P value.

Extended Data Fig. 9 | IsoalloLCA-induced Treg cells suppress transfer colitis.
a, Experimental scheme. Rag1−/− recipient mice were transferred
intraperitoneally with 0.5 million CD45RBhigh naive CD4+ T cells (CD45.1) and
with or without co-transfer of 0.5 million FOXP3-GFP+ Treg cells (CD45.2).
FOXP3-GFP+ cells were cultured under TGFβlo (0.05 ng ml⁻¹), isoalloLCA
(20 μM, 0.01 ng ml⁻¹ TGFβ) and TGFβhigh (1 ng ml⁻¹) conditions with GFP− naive
CD4+ T cells, isolated from CD45.2 FOXP3-IRES-GFP mice. b, Flow cytometric
analysis of the FOXP3-GFP+ cells, following in vitro culture. The gated cells
were sorted and used for co-transfer. c–f, Weight change monitored for
8 weeks; week-7 values are used for unpaired t-test with two-tailed P value (c)
(n = 5 mice per group). At the end of the experiment, colon length (d)
(n = 10 mice per group), H & E staining (e) and the quantification of disease
score (f) (n = 5 mice for ‘none’, 4 mice for other groups). g–j, Flow cytometric
analysis and quantification of the frequency of CD45.1 and CD45.2 (g, i) and the
frequency of FOXP3+ cells in the CD45.2 population (h, j) in each condition
(n = 5 mice per group). k, Quantification of total CD45.1 cell number in the
lamina propria of the colon (n = 5 mice per group). Data are mean ± s.d., by
unpaired t-test with two-tailed P value.

Extended Data Table 1 | Lipophilicity of bile acids

Bile acid                Abbreviation   log D (pH = 8.0)
lithocholic acid         LCA            3.6
isolithocholic acid      isoLCA         a3
allolithocholic acid     alloLCA        3.
isoallolithocholic acid  isoalloLCA     22
3-oxolithocholic acid    3-oxoLCA       2.4
deoxycholic acid         DCA            pee
chenodeoxycholic acid    CDCA           252
ursodeoxycholic acid     UDCA           Ded
obeticholic acid         OCA            23
cholic acid              CA             1.1

Reference: This study

Statistics


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.


A description of all covariates tested


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection: BD FACS DIVA software V8.0.1.


Data analysis: FlowJo V9.9.3 and V10.6.0 software (TreeStar), Wave V2.6.1 (Agilent), GraphPad Prism V7 (GraphPad Software), RStudio V1.2.1335


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability


16S rDNA datasets analyzed in the manuscript are available through NCBI under accession number PRJNA528994. 


Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


[x] Life sciences    [ ] Behavioural & social sciences    [ ] Ecological, evolutionary & environmental sciences


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 
Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size No statistical methods were used to predetermine sample size. Sample sizes were determined by the magnitude and consistency of measurable 
differences. The precise numbers of animals used are indicated in the figure legends. 


Data exclusions No data were excluded from analyses. 
Replication Experiments were repeated, so our data represent at least two to three independent experiments with similar results. 
Randomization Mice used in the in vivo testing of bile acids were randomly assigned to experimental groups. 


Blinding Investigators were not blinded during group allocation and data analysis. Investigators were blinded for disease scoring when performing 
transfer T cell colitis experimental analyses. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems Methods 
n/a | Involved in the study n/a | Involved in the study 
Antibodies ChIP-seq 
Eukaryotic cell lines Flow cytometry 
Palaeontology MRI-based neuroimaging 


Animals and other organisms 


Human research participants 


Clinical data 


Antibodies 


Antibodies used Flow cytometry antibodies were purchased either from eBioscience: anti-IFNγ (XMG1.2; #48-7311-82; Lot:1991937; 1:200); anti-IL-17a 
(eBio17B7; #25-7177-82; Lot:1994058; 1:200); anti-FoxP3 (FJK-16s; #11-5773-82; Lot:2007700; 1:100); anti-CD4 (RM4-5; 
#48-0042-82; Lot:1967921; 1:200); anti-CD3e (145-2C11; #48-0031-82; Lot:4311331; 1:200); anti-CD25 (PC61.5; #25-0251-82; 
Lot: E07536-1635; 1:200); anti-CD69 (H1.2F3; #45-0691-82; Lot:E08349-1633; 1:200); anti-CD62L (MEL-14; #11-0621-85; 
Lot:E00377-1631; 1:200); anti-Nur77 (12.14; #53-5965-82; Lot:4347883; 1:100), or from Biolegend: anti-IL-4 (11B11; #504104; 
Lot:B271497; 1:200); anti-CD45 (30-F11; #103114; Lot:B247440; 1:200); CD45RB (C363-16A; #103308; Lot:4108938; 1:200); 
anti-CD44 (IM7; #103032; Lot: B238172; 1:200). Specific fluorochrome colours are labelled in the figures. ChIP antibodies: anti-Rabbit 
IgG (#ab46540; 1:100); anti-H3K27Ac (#ab4729; Lot:GR286678-2; 1:100); anti-H3K4me1 (#ab8895; Lot:GR283603-1; 
1:100); anti-P300 (#ab14984; Lot:GR272730-1; 1:100). 


Validation For eBioscience antibodies and Biolegend FACS antibodies: anti-IFNγ (XMG1.2; #48-7311-82; Lot:1991937; 1:200); anti-IL-17a 
(eBio17B7; #25-7177-82; Lot:1994058; 1:200); anti-FoxP3 (FJK-16s; #11-5773-82; Lot:2007700; 1:100); anti-CD4 (RM4-5; 
#48-0042-82; Lot:1967921; 1:200); anti-CD3e (145-2C11; #48-0031-82; Lot:4311331; 1:200); anti-CD25 (PC61.5; #25-0251-82; 
Lot: E07536-1635; 1:200); anti-CD69 (H1.2F3; #45-0691-82; Lot:E08349-1633; 1:200); anti-CD62L (MEL-14; #11-0621-85; 
Lot:E00377-1631; 1:200); anti-Nur77(12.14; #53-5965-82; Lot:4347883; 1:100); anti-IL-4 (11B11; #504104; Lot:B271497; 1:200); 
anti-CD45 (30-F11; #103114; Lot:B247440; 1:200); CD45RB (C363-16A; #103308; Lot:4108938; 1:200); anti-CD44 (IM7; #103032; 
Lot: B238172; 1:200), manufacturer provided technical data sheets and validated antibodies for flow cytometry individually. 


For ChIP antibodies: anti-Rabbit IgG (#ab46540; 1:100); anti-H3K27Ac (#ab4729; Lot:GR286678-2; 1:100); anti-H3K4me1 
(#ab8895; Lot:GR283603-1; 1:100); anti-P300 (#ab14984; Lot:GR272730-1; 1:100) abcam validated each batch of antibody for 
ChIP analysis on its website. 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) 293 cell line was obtained from ATCC 


Authentication 293 cell line was not authenticated 




Mycoplasma contamination Cell line was not tested for mycoplasma 


Commonly misidentified lines (See ICLAC register) The cell line is not listed in the ICLAC register. 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals C57BL/6 (stock no. 000664), FoxP3-GFP (stock no. 016958), FXR-KO (stock no. 004144), VDR-KO (stock no. 006133), CD45.1 
(stock no. 002014), PhaMexcised (stock no. 018397), Rag1-KO (stock no. 002216) mice were purchased from Jackson Laboratory. 
SFB-containing C57BL/6 mice were purchased from Taconic Biosciences. FoxP3-CNS-KO and control mice were provided by the Ye 
Zheng Lab at the Salk Institute. Both male and female mice, aged 6-10 weeks, were used in the study. 


Wild animals This study did not involve the use of wild animals. 
Field-collected samples This study did not involve the use of field-collected samples. 
Ethics oversight All mouse studies were performed in full compliance with IACUC approved protocol and guidelines of Harvard Medical School. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics This study uses fecal samples collected from patients with biopsy-proven active ulcerative colitis. 
Recruitment Eligible patients were identified by the treating physician at Weill Cornell Medicine. 


Ethics oversight Fecal samples were obtained from patients with active ulcerative colitis under an Institutional Review Board-approved protocol 
and informed consent was obtained at Weill Cornell Medicine IRB 1404014982. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Flow Cytometry 


Plots 


Confirm that: 


The axis labels state the marker and fluorochrome used (e.g. CD4-FITC). 


The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers). 


All plots are contour plots with outliers or pseudocolor plots. 


A numerical value for number of cells or percentage (with statistics) is provided. 


Methodology 
Sample preparation This information was included in the Methods section. 
Instrument LSRII analyzer (BD Biosciences), Aria II (BD Biosciences) 
Software FACS DIVA software V8.0.1 (BD Biosciences) and FlowJo V9.9.3 software (TreeStar) 
Cell population abundance Sort-purification was carried out using a FACSAria cell sorter (BD Biosciences), with >98% purity. 


Gating strategy Examples for the gating strategy were presented in Extended Data Fig.1 


Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information. 
Andrew V. Anzalone, Peyton B. Randolph, Jessie R. Davis, Alexander A. Sousa, 
Luke W. Koblan, Jonathan M. Levy, Peter J. Chen, Christopher Wilson, 
Gregory A. Newby, Aditya Raguram & David R. Liu* 


Most genetic variants that contribute to disease’ are challenging to correct efficiently 
and without excess byproducts? °. Here we describe prime editing, a versatile and 
precise genome editing method that directly writes new genetic information into a 
specified DNA site using a catalytically impaired Cas9 endonuclease fused to an 
engineered reverse transcriptase, programmed with a prime editing guide RNA 
(pegRNA) that both specifies the target site and encodes the desired edit. We 
performed more than 175 edits in human cells, including targeted insertions, 
deletions, and all 12 types of point mutation, without requiring double-strand breaks 
or donor DNA templates. We used prime editing in human cells to correct, efficiently 
and with few byproducts, the primary genetic causes of sickle cell disease (requiring a 
transversion in HBB) and Tay-Sachs disease (requiring a deletion in HEXA); to install a 
protective transversion in PRNP; and to insert various tags and epitopes precisely into 
target loci. Four human cell lines and primary post-mitotic mouse cortical neurons 
support prime editing with varying efficiencies. Prime editing shows higher or similar 
efficiency and fewer byproducts than homology-directed repair, has complementary 


strengths and weaknesses compared to base editing, and induces much lower off- 
target editing than Cas9 nuclease at known Cas9 off-target sites. Prime editing 
substantially expands the scope and capabilities of genome editing, and in principle 
could correct up to 89% of known genetic variants associated with human diseases. 


The ability to make virtually any targeted change in the genome of any 
living cell or organism is a longstanding aspiration of the life sciences. 
Despite rapid advances in genome editing technologies, the majority 
of the more than 75,000 known disease-associated genetic variants in 
humans! remain difficult to correct or install in most therapeutically rel- 
evant cell types (Fig. 1a). Programmable nucleases such as CRISPR-Cas9 
make double-strand DNA breaks (DSBs) that can disrupt genes by induc- 
ing mixtures of insertions and deletions (indels) at target sites? +. DSBs, 
however, are associated with undesired outcomes, including complex 
mixtures of products, translocations’, and activation of p53°’. Moreover, 
the vast majority of pathogenic alleles arise from specific insertions, deletions, 
or base substitutions that require more precise editing technologies 
to correct (Fig. 1a, Supplementary Discussion). Homology-directed 
repair (HDR) stimulated by DSBs° has been widely used to install precise 
DNA changes. HDR, however, relies on exogenous donor DNA repair templates, 
typically generates an excess of indels from end-joining repair of 
DSBs, and is inefficient in most therapeutically relevant cell types (T cells 
and some types of stem cell being important exceptions)”"°. Whereas 
enhancing the efficiency and precision of DSB-mediated editing remains 
the focus of promising efforts” ©, these challenges motivate the explora- 
tion of alternative precision genome editing strategies. 

Base editing can efficiently install the four transition mutations (C>T, 
G>A, A>G, and T>C) without requiring DSBs in many cell types and 


organisms, including mammals’*”, but cannot currently perform the 
eight transversion mutations (C>A, C>G, G>C, G>T, A>C, A>T, T>A, 
and T>G), such as the T•A-to-A•T mutation needed to directly correct 
the most common cause of sickle cell disease (HBB(E6V)). In addition, 
no DSB-free method has been reported to perform targeted deletions, 
such as the removal of the four-base duplication that causes Tay-Sachs 
disease (HEXA(1278+TATC)), or targeted insertions, such as the three-base 
insertion required to directly correct the most common cause of cystic 
fibrosis (CFTR(ΔF508)). Targeted transversions, insertions, and deletions 
are therefore difficult to install or correct efficiently and without 
excess byproducts in most cell types, even though they collectively 
account for most known pathogenic alleles (Fig. 1a). 
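
The split of the 12 possible point mutations into four transitions and eight transversions can be checked with a short sketch. It assumes only the standard definition (a transition swaps a purine for a purine or a pyrimidine for a pyrimidine; anything else is a transversion); the function and variable names are illustrative and not taken from the paper.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change ref>alt."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Enumerate all 12 ordered base-to-base conversions.
point_mutations = {f"{r}>{a}": classify(r, a) for r, a in permutations("ACGT", 2)}
transitions = sorted(k for k, v in point_mutations.items() if v == "transition")
transversions = sorted(k for k, v in point_mutations.items() if v == "transversion")

print(len(point_mutations), transitions, len(transversions))
```
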

Here we describe the development of prime editing, a ‘search-and- 
replace’ genome editing technology that mediates targeted insertions, 
deletions, all 12 possible base-to-base conversions, and combinations 
thereof in human cells without requiring DSBs or donor DNA templates. 
Prime editors (PEs), initially exemplified by PE1, use a reverse transcriptase 
(RT) fused to an RNA-programmable nickase and a prime editing guide 
RNA (pegRNA) to copy genetic information directly from an extension 
on the pegRNA into the target genomic locus. PE2 uses an engineered RT 
to increase editing efficiencies, while PE3 nicks the non-edited strand to 
induce its replacement and further increase editing efficiency, typically 
to 20-50% with 1-10% indel formation in human HEK293T cells. Prime 


1Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of Harvard and MIT, Cambridge, MA, USA. 2Department of Chemistry and Chemical Biology, Harvard University, 
Cambridge, MA, USA. 3Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA. *e-mail: drliu@fas.harvard.edu 
Fig. 1 | Overview of prime editing and feasibility studies in vitro and in yeast 
cells. a, The 75,122 known pathogenic human genetic variants in ClinVar 
(accessed July, 2019), classified by type. b, A prime editing complex consists of 
a PE protein containing an RNA-guided DNA-nicking domain, such as Cas9 
nickase, fused to an RT domain and complexed with a pegRNA. The PE-pegRNA 
complex enables a variety of precise DNA edits at a wide range of positions. 
spCas9, Streptococcus pyogenes Cas9. c, The PE-pegRNA complex binds the 
target DNA and nicks the PAM-containing strand. The resulting 3’ end 
hybridizes to the PBS, then primes reverse transcription of new DNA containing 
the desired edit using the RT template of the pegRNA. Equilibration between 
the edited 3’ flap and the unedited 5’ flap, cellular 5’ flap cleavage and ligation, 
and DNA repair results in stably edited DNA. d, In vitro primer extension assays 
with 5’-extended pegRNAs, pre-nicked dsDNA substrates containing 5’-Cy5- 


editing offers much lower off-target activity than Cas9 at known Cas9 off- 
target loci, far fewer byproducts and higher or similar efficiency compared 
to Cas9-initiated HDR, and complementary strengths and weaknesses 
compared to base editors. By enabling precise targeted insertions, deletions, 
and all 12 possible classes of point mutations without requiring DSBs 
or donor DNA templates, prime editing has the potential to advance the 
study and correction of the vast majority of pathogenic alleles. 


Prime editing strategy 


Cas9 targets DNA using a guide RNA containing a spacer sequence that 
hybridizes to the target DNA site? *”°”, We envisioned the generation of 
labelled PAM strands, dCas9, and a commercial M-MLV RT variant (RT, 
Superscript III). dCas9 was complexed with pegRNAs, then added to DNA 
substrates along with the indicated components. After 1 h, reactions were 
analysed by denaturing PAGE to visualize Cy5 fluorescence. e, Primer extension 
assays performed as in d using 3’-extended pegRNAs pre-complexed with 
dCas9 or Cas9(H840A) nickase, and pre-nicked or non-nicked dsDNA 
substrates. f, Yeast colonies transformed with GFP-mCherry fusion reporter 
plasmids edited in vitro with pegRNAs, Cas9 nickase, and RT. Plasmids 
containing nonsense or frameshift mutations between GFP and mCherry were 
edited with pegRNAs that restored mCherry translation via transversion, 1-bp 
insertion, or 1-bp deletion. GFP and mCherry double-positive cells (yellow) 
reflect successful editing. Images in d-f are representative of n = 2 independent 
replicates. For gel source data, see Supplementary Fig. 1. 


guide RNAs that both specify the DNA target and contain new genetic 
information that replaces target DNA nucleotides. To transfer informa- 
tion from these engineered guide RNAs to target DNA, we proposed that 
genomic DNA, nicked at the target site to expose a 3’-hydroxyl group, 
could be used to prime the reverse transcription of an edit-encoding 
extension on the engineered guide RNA (the pegRNA) directly into the 
target site (Fig. 1b, c, Supplementary Discussion). 

These initial steps result in a branched intermediate with two redundant 
single-stranded DNA flaps: a 5’ flap that contains the unedited DNA 
sequence and a 3’ flap that contains the edited sequence copied from 
the pegRNA (Fig. 1c). Although hybridization of the perfectly complementary 
5’ flap to the unedited strand is likely to be thermodynamically 
Fig. 2 | Prime editing of genomic DNA in human cells by PE1 and PE2. a, Use of 
an engineered M-MLV reverse transcriptase (D200N, L603W, T306K, W313F, 
T330P) in PE2 substantially improves prime editing efficiencies at five genomic 
sites in HEK293T cells, and small insertion and small deletion edits at HEK3. 


favoured, 5’ flaps are the preferred substrate for structure-specific 
endonucleases such as FEN1”, which excises 5’ flaps generated during 
lagging-strand DNA synthesis and long-patch base excision repair. The 
redundant unedited DNA may also be removed by 5’ exonucleases 
such as EXO1”’. 

We reasoned that preferential 5’ flap excision and 3’ flap ligation 
could drive the incorporation of the edited DNA strand, creating 
heteroduplex DNA containing one edited strand and one unedited 
strand (Fig. 1c). DNA repair to resolve the heteroduplex by copying the 
information in the edited strand to the complementary strand would 
permanently install the edit (Fig. 1c). On the basis of a similar strategy 
we developed to favourably resolve heteroduplex DNA during base 
editing’* 8, we hypothesized that nicking the non-edited DNA strand 
might bias DNA repair to preferentially replace the non-edited strand. 


Validation in vitro and in yeast 


First, we tested whether the 3’ end of the protospacer-adjacent motif 
(PAM)-containing DNA strand cleaved by the RuvC nuclease domain 
of Cas9 was sufficiently accessible to prime reverse transcription. We 
designed pegRNAs by adding to single guide RNAs (sgRNAs) a primer 
binding site (PBS) that allows the 3’ end of the nicked DNA strand to 
hybridize to the pegRNA, and an RT template containing the desired 
edit (Fig. 1c). We constructed candidate pegRNAs by extending sgRNAs 
on either end with a PBS sequence (5-6 nucleotides (nt)) and an 
RT template (7-22 nt), and confirmed that 5’-extended pegRNAs support 
Cas9 binding to target DNA in vitro and that both 5’-extended and 
3’-extended pegRNAs support Cas9-mediated DNA nicking in vitro 
and DNA cleavage in mammalian cells (Extended Data Fig. 1a—c). Next, 
we tested the compatibility of these candidate pegRNAs with reverse 
transcription using pre-nicked 5’-Cy5-labelled double-stranded DNA 
(dsDNA) substrates, catalytically dead Cas9 (dCas9), and a commercial 
Moloney murine leukaemia virus (M-MLV) RT variant (Extended Data 
Fig. 1d). When all components were present, the labelled DNA strand 
was efficiently converted into longer DNA products with gel mobilities 
consistent with reverse transcription along the RT template (Fig. 1d, 
Extended Data Fig. 1d, e). Omission of dCas9 led to nick translation 
products that resulted from RT-mediated DNA polymerization on the 
DNA template, with no pegRNA information transfer. No DNA polymerization 
products were observed when the pegRNA was replaced by 


[Fig. 2b plots: x axis, RT template length (nt)] 


b, PE2 editing efficiencies with varying RT template lengths at five genomic 
sites in HEK293T cells. Editing efficiencies reflect sequencing reads that 
contain the intended edit and do not contain indels among all treated cells, 
with no sorting. Mean ± s.d. of n = 3 independent biological replicates. 


a conventional sgRNA (Fig. 1d). These results demonstrate that nicked 
DNA exposed by dCas9 is competent to prime reverse transcription 
from a pegRNA. 

Next, we tested non-nicked dsDNA substrates with a Cas9(H840A) 
nickase that nicks the PAM-containing strand’. In these reactions, 
5’-extended pegRNAs generated reverse transcription products inef- 
ficiently (Extended Data Fig. 1f), but 3’-extended pegRNAs enabled 
efficient Cas9 nicking and reverse transcription (Fig. 1e). The use of 
3’-extended pegRNAs generated only a single apparent product, despite 
the theoretical possibility that reverse transcription could terminate 
anywhere within the pegRNA. DNA sequencing of reactions with Cas9 
nickase, RT, and 3’-extended pegRNAs revealed that the complete RT 
template sequence was reverse transcribed into the DNA substrate 
(Extended Data Fig. 1g). These experiments establish that 3’-extended 
pegRNAs can direct Cas9 nickase and template reverse transcription 
in vitro. 

To evaluate the eukaryotic cell DNA repair outcomes of 3’ flaps 
produced by pegRNA-programmed reverse transcription in vitro, we 
performed in vitro prime editing on reporter plasmids, then trans- 
formed the reaction products into yeast cells (Extended Data Fig. 2). 
We constructed reporter plasmids encoding EGFP and mCherry sepa- 
rated by a linker containing an in-frame stop codon, +1 frameshift, or 
-1 frameshift. When plasmids were edited in vitro with Cas9 nickase, 
RT, and 3’-extended pegRNAs encoding a transversion that corrects 
the premature stop codon, 37% of yeast transformants expressed both 
GFP and mCherry (Fig. 1f, Extended Data Fig. 2). Reactions edited with 
5’-extended pegRNAs yielded fewer GFP and mCherry double-positive 
colonies (9%). Productive editing was also observed using 3’-extended 
pegRNAs that insert a single nucleotide (15%) or delete a single nucle- 
otide (29%) to correct frameshift mutations (Fig. 1f, Extended Data 
Fig. 2). These results demonstrate that DNA repair in eukaryotic cells 
can resolve 3’ DNA flaps from prime editing to incorporate precise 
transversions, insertions, and deletions. 


Prime editor 1 

Encouraged by these observations, we sought to develop a prime edit- 
ing system with a minimum number of components that could edit 
genomic DNA in mammalian cells. We transfected HEK293T cells with 
one plasmid encoding a fusion of the wild-type M-MLV RT through a 
Fig. 3 | PE3 and PE3b systems nick the non-edited strand to increase prime 
editing efficiency. a, Overview of prime editing by PE3. After initial synthesis 
of the edited strand, 5’ flap excision leaves behind a DNA heteroduplex 
containing one edited strand and one non-edited strand. Mismatch repair 
resolves the heteroduplex to give either edited or non-edited products. 
Nicking the non-edited strand favours repair of that strand, resulting in 
preferential generation of duplex DNA containing the desired edit. b, The 


flexible linker to either terminus of the Cas9(H840A) nickase, and a 
second plasmid encoding a pegRNA (Extended Data Fig. 3a). Initial 
attempts led to no detectable editing. 

Extension of the PBS in the pegRNA to 8-15 bases, however, led to 
detectable installation of a transversion at the HEK293 site 3 (hereafter 
referred to as HEK3) target site, with higher efficiencies when the RT was 
fused to the C terminus of Cas9 nickase than when it was fused to the N 
terminus (Extended Data Fig. 3b). These results suggest that wild-type 
M-MLV RT fused to Cas9 requires longer PBS sequences for genome 
editing in human cells compared to what is required in vitro using the 
commercial variant of M-MLV RT supplied in trans. We designated 
this M-MLV RT fused to the C terminus of Cas9(H840A) nickase as PE1. 

We tested the ability of PE1 to introduce transversion point mutations 
at four additional genomic sites specified by the pegRNA (Fig. 2a). 
Editing efficiency at these sites was dependent on PBS length, with 
maximal editing efficiencies reaching 0.7-5.5% (Fig. 2a). Indels from PE1 
were minimal, averaging 0.2 ± 0.1% (mean ± s.d.) for the five sites under 
conditions that maximized each site’s editing efficiency (Extended 
Data Fig. 3a-f). PE1 also mediated targeted insertions and deletions 
with 4-17% efficiency at the HEK3 locus (Fig. 2a). These findings show 
that PE1 can directly install targeted transversions, insertions, and 
deletions without requiring DSBs or DNA templates. 


Prime editor 2 


We hypothesized that engineering the RT in PE1 might improve the 
efficiency of DNA synthesis during prime editing. M-MLV RT muta- 
tions that increase thermostability”, processivity™“, and DNA-RNA 
substrate affinity”, and that inactivate RNaseH activity”’, have been 
reported. We constructed 19 variants of PE1 containing a variety of RT 
mutations to evaluate their editing efficiency in human cells. 

First, we investigated M-MLV RT variants that support reverse transcription 
at elevated temperatures”. Introduction of D200N, L603W 
effect of complementary strand nicking on prime editing efficiency and indel 
formation. ‘None’ refers to PE2 controls, which do not nick the complementary 
strand. c, Comparison of editing efficiencies with PE2, PE3, and PE3b (edit- 
specific complementary strand nick). Editing efficiencies reflect sequencing 
reads that contain the intended edit and do not contain indels among all 
treated cells, with no sorting. Mean ± s.d. of n = 3 independent biological 
replicates. 


and T330P into M-MLV RT, hereafter referred to as M3, led to a 6.8- 
fold average increase in transversion and insertion editing efficiency 
across five genomic loci in HEK293T cells compared to PE1 (Extended 
Data Fig. 4). 

We tested additional RT mutations that have been shown to enhance 
binding to the template-PBS complex, enzyme processivity, and thermostability”’. 
Among the 14 additional mutants analysed, addition of 
T306K and W313F to M3 improved editing efficiency an additional 
1.3-fold to 3.0-fold for six transversion or insertion edits across five 
genomic sites (Extended Data Fig. 4). This pentamutant RT incor- 
porated into PE1 (Cas9(H840A)-M-MLV RT(D200N/L603W/T330P/ 
T306K/W313F)) is hereafter referred to as prime editor 2 (PE2). 

PE2 installs single-nucleotide transversion, insertion, and dele- 
tion mutations with substantially higher efficiency than PE1, and is 
compatible with shorter PBS sequences, consistent with enhanced 
engagement of transient genomic DNA-PBS complexes (Fig. 2a). On 
average, PE2 led to a 1.6- to 5.1-fold improvement in the efficiency of 
prime editing point mutations over PE1. PE2 also performed targeted 
insertions and deletions more efficiently than PE1 (Fig. 2a, Extended 
Data Fig. 4d). 


Optimization of pegRNAs 

We systematically probed the relationship between pegRNA structure 
and PE2 editing efficiency. Priming regions with lower G/C content 
generally required longer PBS sequences, consistent with the energetic 
requirements of hybridization of the nicked DNA strand to the pegRNA 
PBS (Fig. 2a). No PBS length or G/C content level was strictly predictive 
of editing efficiency, suggesting that other factors such as DNA primer 
or RT template secondary structure also influence editing activity. 
We recommend starting with a PBS length of about 13 nt, and testing 
different PBS lengths during optimization, especially if the priming 
region deviates from about 40-60% G/C content. 
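
The PBS guidance above (start near 13 nt, and test more lengths when the priming region falls outside roughly 40-60% G/C) can be sketched as a small helper. This is a minimal illustration, not the authors' procedure: the function names and the `spread` and `wide_spread` values are hypothetical choices made here, and only the 13-nt starting point and the 40-60% window come from the text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def pbs_lengths_to_test(priming_region: str, start: int = 13,
                        spread: int = 2, wide_spread: int = 4) -> list[int]:
    """Suggest PBS lengths (nt) to screen around the starting length.

    `spread` and `wide_spread` are hypothetical screening widths: the
    wider one is used when G/C content lies outside the 40-60% window.
    """
    gc = gc_fraction(priming_region)
    window = spread if 0.40 <= gc <= 0.60 else wide_spread
    low = max(1, start - window)
    high = min(len(priming_region), start + window)
    return list(range(low, high + 1))


# Toy 20-nt priming regions (not real target sites).
print(pbs_lengths_to_test("ATGCATGCATGCATGCATGC"))  # balanced G/C
print(pbs_lengths_to_test("ATATATATATATATATATAT"))  # low G/C, wider screen
```
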
Fig. 4 | Targeted insertions, deletions, and all 12 types of point mutation 
with PE3 at seven endogenous genomic loci in HEK293T cells. a, All 12 types 
of single-nucleotide edit from position +1 to +8 of the HEK3 site using a 10-nt RT 
template, counting the first nucleotide following the pegRNA-induced nick as 
position +1. b, Long-range PE3 edits at HEK3 using a 34-nt RT template. c-e, PE3- 
mediated transition and transversion edits at the specified positions for RNF2 


Next, we systematically evaluated pegRNAs with RT templates 
10-20 nt long at five genomic target sites using PE2 (Fig. 2b), and with RT 
templates up to 31 nt at three genomic sites (Extended Data Fig. 5a—c). 
As with PBS length, RT template length could also be varied to maxi- 
mize prime editing efficiency, although many RT template lengths of 
ten or more nucleotides performed comparably. As some target sites 
preferred longer RT templates (more than 15 nt; FANCF, EMX1), whereas 
other loci preferred shorter RT templates (HEK3 and HEK293 site 4, 
hereafter referred to as HEK4) (Fig. 2b), we recommend starting with 
about 10-16 nt and testing shorter and longer RT templates during 
pegRNA optimization. 

Notably, the use of RT templates that place a C adjacent to the 3’ hairpin 
of the sgRNA scaffold generally resulted in lower editing efficiency 
(Extended Data Fig. 5a-c). We speculate that a C as the first nucleotide 
of the 3’ extension can disrupt guide RNA structure by pairing with G81, 
which normally forms a pi stack with Y1356 in Cas9 and a non-canonical 
base pair with A68 of the sgRNA”’. Because many RT template lengths 
support prime editing, we recommend designing pegRNAs so that the 
first base of the 3’ extension is not C. 
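
The rule about the first base of the 3’ extension can be applied when choosing an RT template length. The sketch below assumes the 3’ extension is written 5’ to 3’ as RT template followed by PBS, with both segments being the reverse complement of the PAM-strand sequence they pair with or encode; sequences are kept in the DNA alphabet for simplicity, and the function names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_extension(upstream_of_nick: str, edited_downstream: str,
                          pbs_len: int, rt_len: int) -> str:
    """Build the 3' extension (5'->3', DNA alphabet): RT template then PBS.

    upstream_of_nick: PAM-strand sequence ending at the nick (5'->3').
    edited_downstream: desired PAM-strand sequence starting just 3' of the nick.
    """
    if pbs_len > len(upstream_of_nick) or rt_len > len(edited_downstream):
        raise ValueError("requested length exceeds the supplied sequence")
    pbs = reverse_complement(upstream_of_nick[-pbs_len:])
    rt_template = reverse_complement(edited_downstream[:rt_len])
    return rt_template + pbs


def rt_lengths_avoiding_leading_c(upstream_of_nick, edited_downstream,
                                  pbs_len, rt_lengths):
    """Keep RT template lengths whose extension does not start with C."""
    return [n for n in rt_lengths
            if three_prime_extension(upstream_of_nick, edited_downstream,
                                     pbs_len, n)[0] != "C"]


# Toy sequences for illustration only (not a real target site).
upstream = "ACGTACGTACGTA"
edited = "TTGACCGATG"
print(three_prime_extension(upstream, edited, pbs_len=5, rt_len=4))
print(rt_lengths_avoiding_leading_c(upstream, edited, 5, range(3, 8)))
```

Because the extension's first base is the complement of the last templated base, a length is rejected exactly when the edited sequence ends in G at that position.
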


Prime editor 3 systems 

The resolution of heteroduplex DNA from PE2 containing one edited 
and one non-edited strand determines long-term editing outcomes. 
To optimize base editing we previously used Cas9 nickase to nick the 
non-edited strand, directing DNA repair to that strand using the edited 
strand as a template’®“’. To apply this strategy to enhance prime editing, 
we tested nicking the non-edited strand using the Cas9(H840A) 
(c), RUNX1 (d), and VEGFA (e). f, Targeted 1- and 3-bp insertions, and 1- and 3-bp 
deletions with PE3 at seven endogenous genomic loci. g, Targeted precise 
deletions of 5-80 bp at HEK3. h, Combination edits at three endogenous 
genomic loci. Editing efficiencies reflect sequencing reads that contain the 
intended edit and do not contain indels among all treated cells, with no sorting. 
Mean ± s.d. of n = 3 independent biological replicates. 


nickase already present in PE2 and a simple sgRNA (Fig. 3a). As the 
edited DNA strand is also nicked to initiate prime editing, we tested a 
variety of nick locations on the non-edited strand to minimize DSBs 
that lead to indels. 

We first tested this strategy, designated PE3, at five genomic sites in 
HEK293T cells using sgRNAs that induce nicks 14-116 nt away from the 
site of the pegRNA-induced nick. In four of the five sites tested, nicking 
the non-edited strand increased editing efficiency by 1.5- to 4.2-fold 
compared to PE2, to as high as 55% (Fig. 3b). Although the optimal 
nicking position varied depending on the genomic site (Supplementary 
Discussion), nicks positioned 3’ of the edit about 40-90 bp from the 
pegRNA-induced nick generally increased editing efficiency (averaging 
41%) without excess indel formation (6.8% average indels for the sgRNA 
with the highest editing efficiency) (Fig. 3b). We recommend starting 
with non-edited strand nicks about 50 bp from the pegRNA-mediated 
nick, and testing alternative nick locations if indel frequencies exceed 
acceptable levels. 

Nicking the non-edited strand only after resolution of the edited 
strand flap should minimize the presence of concurrent nicks, thereby 
minimizing formation of DSBs and indels. To achieve this goal, we 
designed sgRNAs with spacers that matched the edited strand, but 
not the original allele. Using this strategy, denoted PE3b, mismatches 
between the spacer and the unedited allele should disfavour sgRNA 
nicking until after editing of the PAM strand has taken place. PE3b 
resulted in a 13-fold decrease in the average number of indels (0.74%) 
compared to PE3, without any evident decrease in editing efficiency 
(Fig. 3c). When the edit lies within a second protospacer, we recom- 
mend the PE3b approach. 
Together, these findings establish that PE3 systems improve editing 
efficiencies about threefold compared with PE2, albeit with a higher 
range of indels than PE2. When it is possible to nick the non-edited 
strand with an sgRNA that requires editing before nicking, the PE3b sys- 
tem offers PE3-like editing levels while greatly reducing indel formation. 

To demonstrate the targeting scope and versatility of prime editing 
with PE3, we performed all 24 possible single-nucleotide substitutions 
across the +1 to +8 positions (counting the first base 3’ of the pegRNA- 
induced nick as position +1) of the HEK3 target site using PE3 and pegRNAs 
with 10-nt RT templates (Fig. 4a). These 24 edits collectively cover 
all 12 possible transition and transversion mutations, and proceeded 
with average editing efficiencies (containing no indels) of 33 + 7.9%, 
with 7.5 + 1.8% average indels. 

Notably, long-distance RT templates can also give rise to efficient 
prime editing. Using PE3 with a 34-nt RT template, we installed point 
mutations at positions +12, +14, +17, +20, +23, +24, +26, +30, and +33 in 
the HEK3 locus with 36 ± 8.7% average efficiency and 8.6 ± 2.0% indels 
(Fig. 4b). Other RT templates of 30 or more nucleotides at three other 
genomic sites also supported prime editing (Extended Data Fig. 5a–c). 
As an NGG PAM on either DNA strand occurs on average every 8 bp, far 
less than edit-to-PAM distances that support efficient prime editing, 
prime editing is not substantially constrained by the availability of a 
nearby PAM sequence, in contrast to other precision editing 
methods. Given the presumed relationship between RNA secondary 
structure and prime editing efficiency, when designing pegRNAs for 
long-range edits we recommend testing RT templates of various lengths 
and, if necessary, sequence compositions (for example, using synony- 
mous codons). 
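
The figure of roughly one NGG PAM every 8 bp can be checked with a back-of-the-envelope argument: if all four bases were equally likely and independent, GG would follow any given position with probability 1/16 on each strand, giving about 1/8 PAMs per base pair across both strands. The sketch below tests this on a simulated sequence. The uniform random sequence is an assumption made for illustration; real genomes have skewed base composition, so the value there will differ somewhat.

```python
import random


def count_ngg_pams(seq: str) -> int:
    """Count NGG PAMs on both strands, given the top-strand sequence.

    An NGG on the bottom strand appears as CCN when read on the top strand.
    """
    seq = seq.upper()
    count = 0
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet[1:] == "GG":  # NGG on the top strand
            count += 1
        if triplet[:2] == "CC":  # CCN on the top strand, i.e. NGG on the bottom strand
            count += 1
    return count


rng = random.Random(0)  # fixed seed so the simulation is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(200_000))
spacing = len(genome) / count_ngg_pams(genome)
print(f"average spacing between PAMs: {spacing:.1f} bp")
```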

To further test the scope and limitations of PE3 for introducing point 
mutations, we tested 72 additional edits covering all possible types of 
point mutation across six additional genomic target sites (Fig. 4c-e, 
Extended Data Fig. 5d-f). Editing efficiency averaged 25 ± 14%, while 
indel formation averaged 8.3 ± 7.5%. Because the pegRNA RT template 
includes the PAM sequence, prime editing can induce changes in the 
PAM sequence. In these cases, we observed higher editing efficiency 
(averaging 39 ± 9.7%) and lower indel generation (averaging 5.0 ± 2.9%; 
Fig. 4, mutations at +5 or +6), potentially due to the inability of Cas9 
nickase to re-bind and nick the edited strand before the repair of the 
complementary strand. We recommend editing the PAM, in addition 
to other desired changes, whenever possible. 

Next, we performed 28 targeted small insertions and small deletions 
at seven genomic sites using PE3 (Fig. 4f). Targeted 1-bp and 3-bp 
insertions proceeded with an average efficiency of 32 ± 9.8% and 39 ± 16%, 
respectively. Targeted 1-bp and 3-bp deletions were also efficient, 
averaging 29 ± 14% and 32 ± 11% editing, respectively. Indel generation 
(beyond the target insertion or deletion) averaged 6.8 ± 5.4%. Because 
insertions and deletions between positions +1 and +6 alter the location 
or structure of the PAM, we speculate that insertions or deletions at 
these positions are more efficient because they prevent re-engagement 
of the edited strand. 

We also tested PE3 for its ability to mediate larger precise deletions 
of 5-80 bp at the HEK3 site (Fig. 4g). We observed very high editing 
efficiencies (52-78%) for precise 5-, 10-, 15-, 25-, and 80-bp deletions, 
with indels averaging 11 ± 4.8%. Finally, we tested the ability of PE3 to 
mediate 12 combinations of insertions, deletions, and/or point muta- 
tions across three genomic sites. These combination edits were also 
very efficient, averaging 55% editing with 6.4% indels (Fig. 4h). Together, 
the 156 distinct edits in Fig. 4 and Extended Data Fig. 5d-f establish the 
versatility, precision, and targeting flexibility of PE3 systems. 


Prime editing compared with base editing 


Cytidine base editors (CBEs) and adenine base editors (ABEs) can install 
transition mutations efficiently and with few indels. The application 
of base editing can be limited by unwanted bystander edits from 
the presence of multiple cytidine or adenine bases within the base 
editing activity window, or by the absence of a PAM positioned 
about 15 ± 2 nt from the target nucleotide. We anticipated that prime 
editing could complement base editing when bystander edits are 
unacceptable or when the target site lacks a suitably positioned PAM. 

We compared PEs and CBEs at three genomic loci that contain multi- 
ple target cytosines in the canonical base editing window (protospacer 
positions 4-8, counting the PAM as positions 21-23) using 
current-generation CBEs without or with nickase activity (BE2max and BE4max, 
respectively), or using analogous PE2 and PE3 prime editing systems. 
Among the nine total cytosines within the base editing windows of the 
three sites, BE4max yielded 2.2-fold higher average total C•G-to-T•A 
conversion than PE3 for bases in the centre of the base editing 
window (protospacer positions 5-7, Extended Data Fig. 6a). However, PE3 
outperformed BE4max by 2.7-fold at cytosines positioned outside the 
centre of the base editing window. Overall, indel frequencies for PE2 
were very low (averaging 0.86 ± 0.47%), and for PE3 were similar to or 
modestly higher than that of BE4max (PE3: 2.5-21%; BE4max: 2.5-14%) 
(Extended Data Fig. 6b). 

For the installation of precise edits (with no bystander editing), the 
efficiency of prime editing greatly exceeded that of base editing at 
the above sites, which, like most genomic DNA sites, contain multi- 
ple cytosines within the base editing window. BE4max generated few 
products containing only the single target base-pair conversion with 
no bystander edits. By contrast, prime editing at this site could be used 
to selectively install a C•G-to-T•A edit at any position or combination 
of positions (Extended Data Fig. 6c). 

We also compared nicking and non-nicking adenine base editors 
(ABEs) with PE3 and PE2, with similar results (Extended Data Fig. 6d-f, 
Supplementary Discussion). Collectively, these results indicate that 
base editing and prime editing offer complementary strengths and 
weaknesses for making targeted transition mutations. When a single 
target nucleotide is present within the base editing window, or when 
bystander edits are acceptable, current base editors are typically more 
efficient and generate fewer indels than prime editors. When multiple 
cytosines or adenines are present and bystander edits are undesirable, 
or when PAMs that position target nucleotides for base editing are not 
available, prime editors offer substantial advantages. 


Off-target prime editing 

Prime editing requires target DNA–pegRNA spacer complementarity 
for the Cas9 domain to bind, target DNA–pegRNA PBS complementarity 
to initiate pegRNA-templated reverse transcription, and target 
DNA–RT product complementarity for flap resolution. To test whether 
these three distinct DNA hybridization steps reduce off-target prime 
editing compared to editing methods that require only target-guide 
RNA complementarity, we treated HEK293T cells with PE3 or PE2 and 
16 pegRNAs that target four genomic loci, each of which has at least 
four well-characterized Cas9 off-target sites. We also treated cells 
with Cas9 nuclease and the same 16 pegRNAs, or with Cas9 and four 
sgRNAs targeting the same four protospacers (Supplementary Table 1). 

Consistent with previous studies, Cas9 and sgRNAs targeting HEK3, 
HEK4, EMX1, and FANCF modified the top four known Cas9 off-target 
loci for each sgRNA with average frequencies of 16 ± 16%, 60 ± 26%, 
48 ± 28%, and 4.3 ± 5.6%, respectively (Extended Data Fig. 6g). Cas9 
with pegRNAs modified on-target sites with similar efficiency as Cas9 
with sgRNAs, whereas Cas9 with pegRNAs modified off-target sites at 
4.4-fold lower average efficiency than Cas9 with sgRNAs. 

Strikingly, PE3 or PE2 with the same 16 pegRNAs containing these 
four target spacers resulted in detectable off-target editing at only 3 
out of 16 off-target sites, with only 1 of 16 showing an off-target editing 
efficiency of 1% or more (Extended Data Fig. 6h). Average off-target 
prime editing for pegRNAs targeting HEK3, HEK4, EMX1, and FANCF 
at the top four known Cas9 off-target sites for each protospacer was 
<0.1%, <2.2 ± 5.2%, <0.1%, and <0.13 ± 0.11%, respectively (Extended Data 
Fig. 6h). Notably, at the HEK4 off-target 3 site that was edited by Cas9 
with pegRNA1 at 97% efficiency, PE2 with pegRNA1 resulted in only 0.2% 
off-target editing despite sharing the same pegRNA, demonstrating 
how the two additional hybridization events required for prime 
editing can greatly reduce off-target modification. Together, these results 
suggest that prime editing induces much lower off-target editing than 
Cas9 at known Cas9 off-target sites. 

Fig. 5 | Prime editing of pathogenic mutations, prime editing in primary 
mouse cortical neurons, and comparison of prime editing and HDR in four 
human cell lines. a, Installation (via T•A-to-A•T transversion) and correction 
(via A•T-to-T•A transversion) of the pathogenic E6V-coding mutation in HBB in 
HEK293T cells. Correction either to wild-type HBB, or to HBB containing a 
PAM-disrupting silent mutation, is shown. b, Installation (via 4-bp insertion) and 
correction (via 4-bp deletion) of the pathogenic HEXA1278+TATC allele in HEK293T 
cells. Correction either to wild-type HEXA, or to HEXA containing a 
PAM-disrupting silent mutation, is shown. c, Installation of the protective 
G127V-coding variant in PRNP in HEK293T cells via G•C-to-T•A transversion. 

Reverse transcription of 3’-extended pegRNAs in principle can 
proceed into the guide RNA scaffold, resulting in scaffold sequence 
insertion that contributes to indels at the target locus. We analysed 
66 PE3 editing experiments at four loci in HEK293T cells and observed 
1.7 ± 1.5% average total insertion of any number of pegRNA scaffold 
nucleotides (Extended Data Fig. 7). We speculate that inaccessibility 
of the guide RNA scaffold to reverse transcription due to Cas9 domain 
binding, and cellular excision of the mismatched 3’ end of 3’ flaps that 
extend into the pegRNA scaffold, minimize products that incorporate 
pegRNA scaffold nucleotides. 

The presence of endogenous human RTs from retroelements and 
telomerase suggests that RT activity is not inherently toxic to human 
cells. Indeed, we observed no differences in the viability of HEK293T 
cells expressing dCas9, Cas9(H840A) nickase, PE2, or PE2 with R110S 
and K103L mutations (PE2-dRT) that inactivate the RT and abolish 
prime editing (Extended Data Fig. 8a, b). To evaluate changes in the 
cellular transcriptome that result from prime editing, we performed 
RNA sequencing (RNA-seq) on HEK293T cells expressing PE2, PE2- 
dRT, or Cas9(H840A) nickase together with a PRNP-targeting or HEXA- 
targeting pegRNA (Extended Data Fig. 8c-k), and observed that active 
PE2 minimally perturbed the transcriptome relative to Cas9 nickase or 
a control lacking active RT (Supplementary Discussion). 


Prime editing pathogenic mutations 

We tested the ability of PE3 to directly install or correct in human cells 
transversion, insertion, and deletion mutations that cause genetic 
diseases. Sickle cell disease is caused by an A•T-to-T•A transversion 
mutation in HBB, resulting in an E6V mutation in β-globin (Supplementary 
Discussion). We used PE3 to install this HBB mutation into HEK293T 
cells with 44% efficiency and 4.8% indels (Fig. 5a) and isolated from 
a single prime editing experiment six HEK293T cell lines that were 
homozygous (triploid) for the mutated HBB allele (Supplementary 
Note 1). To correct the mutant HBB allele to wild-type HBB, we treated 
HEK293T cells homozygous for mutant HBB with PE3 and a pegRNA 
programmed to directly revert the HBB mutation to wild-type HBB. 
All 14 tested pegRNAs mediated efficient correction of mutant HBB to 
wild-type HBB (26-52% efficiency), and indel levels averaged 2.8 ± 0.70% 
(Extended Data Fig. 9a). Introduction of a PAM-modifying silent 
mutation improved editing efficiency and product purity to 58% correction 
with 1.4% indels (Fig. 5a). 

The most common mutation that causes Tay-Sachs disease is a 4-bp 
insertion in HEXA (HEXA1278+TATC). We used PE3 to install this 4-bp 
insertion into HEXA with 31% efficiency and 0.8% indels (Fig. 5b), and 
isolated two HEK293T cell lines that were homozygous for HEXA1278+TATC 
(Supplementary Note 1). We used these cells to test 43 pegRNAs and 
three nicking sgRNAs with PE3 or PE3b systems for correction of the 
pathogenic insertion in HEXA (Extended Data Fig. 9b). Nineteen of the 
43 pegRNAs tested resulted in editing with an efficiency of 20% or more. 
Correction to wild-type HEXA with the best pegRNA proceeded with 33% 
efficiency and 0.32% indels using PE3b (Fig. 5b, Extended Data Fig. 9b). 

Finally, we used PE3 to install a protective G•C-to-T•A transversion 
into PRNP (resulting in PRNP(G127V)) in HEK293T cells, introducing 
a mutant allele that confers resistance to prion disease in humans and 
mice (Supplementary Discussion). We evaluated four pegRNAs and 
three nicking sgRNAs. The most effective pegRNA with PE3 resulted 
in 53% installation of G127V, with 1.7% indels (Fig. 5c). Together, these 
results establish the ability of prime editing in human cells to install 
or correct transversion, insertion, or deletion mutations that cause 
or confer resistance to disease efficiently, and with few byproducts. 


Other cell lines and primary neurons 


Next, we tested prime editing at endogenous sites in three additional 
human cell lines (Extended Data Fig. 10a, Supplementary Discussion). 
In K562 cells, PE3 achieved three transversion edits and a His6 tag 
insertion with 15-30% editing efficiency and 0.85-2.2% indels (Extended 
Data Fig. 10a). In U2OS cells, we installed transversion mutations, as 
well as a 3-bp insertion and His6 tag insertion, with 7.9-22% editing 
efficiency and 0.13-2.2% indels (Extended Data Fig. 10a). Finally, in HeLa 
cells we performed a 3-bp insertion with 12% average efficiency and 
1.3% indels (Extended Data Fig. 10a). Collectively, these data indicate 
that cell lines other than HEK293T support prime editing, although 
editing efficiencies vary by cell type and are generally less efficient 
than in HEK293T cells. Editing:indel ratios remained favourable in all 
human cell lines tested. 

To determine whether prime editing is possible in post-mitotic, ter- 
minally differentiated primary cells, we transduced primary cortical 
neurons from E18.5 mice with a PE3 lentiviral delivery system in which 
PE2 protein components were expressed from the neuron-specific 
synapsin promoter along with a GFP marker (see Methods). Nuclei 
were isolated two weeks after transduction and sequenced directly, 
or sorted for GFP expression before sequencing. We observed 7.1% 
average prime editing of DNMT1 with 0.58% average indels in sorted 
cortical neuron nuclei (Fig. 5d). Cas9 nuclease in the same lentivirus 
system resulted in 31% average indels among sorted nuclei (Fig. 5d). 
These data indicate that post-mitotic, terminally differentiated primary 
cells can support prime editing. 


Prime editing compared with HDR 

Finally, we compared the performance of PE3 with that of optimized 
Cas9-initiated HDR in mitotic cell lines that support HDR. We 
treated HEK293T, HeLa, K562 and U2OS cells with Cas9 nuclease, an 
sgRNA, and a single-stranded DNA (ssDNA) donor template designed to 
install a variety of transversion and insertion edits (Fig. 5e, f, Extended 
Data Fig. 10). Cas9-initiated HDR in all cases successfully installed the 
desired edit, but with far higher levels of indel byproducts than with PE3, 
as expected given that Cas9 induces DSBs. In HEK293T cells, the ratio 
of editing to indels for installation or correction of the allele encoding 
HBB(E6V) or installation of the allele encoding PRNP(G127V) was on 
average 270-fold higher for PE3 than for Cas9-initiated HDR. 

Comparisons between PE3 and HDR in human cell lines other than 
HEK293T showed similar results, although with lower PE3 editing 
efficiencies (Fig. 5e, f, Supplementary Discussion). Collectively, these data 
indicate that HDR typically results in similar or lower editing efficiencies 
than PE3 with far more indels in four tested human cell lines (Extended 
Data Fig. 10). 


Discussion and future directions 


The ability to insert arbitrary DNA sequences with single-nucleotide 
precision is an especially promising capability of prime editing. For 
example, we used PE3 in HEK293T cells to precisely insert into HEK3 a 
His6 tag (18 bp, 65% efficiency), a Flag epitope tag (24 bp, 18% efficiency), 
and an extended Cre recombinase loxP site (44 bp, 23% efficiency) with 
3.0-5.9% indels (Fig. 5g). We anticipate that the ability to efficiently and 
precisely insert new DNA sequences into target sites in living cells will 
enable many biotechnological and therapeutic applications. 

Collectively, the prime editing experiments described here performed 
19 insertions up to 44 bp, 23 deletions up to 80 bp, 119 point mutations 
including 83 transversions, and 18 combination edits at 12 endogenous 
loci in the human and mouse genomes at locations ranging from 3 bp 
upstream to 29 bp downstream of a PAM without making explicit DSBs. 
These results establish prime editing as a remarkably versatile genome 
editing method. Because 85-99% of insertions, deletions, indels, and 
duplications in ClinVar are 30 bp in length or smaller (Extended Data 
Fig. 11), in principle prime editing could correct up to about 89% of the 
75,122 pathogenic human genetic variants in ClinVar (Fig. 1a). 

Prime editing offers many possible choices of pegRNA-induced 
nick locations, sgRNA-induced second nick locations, PBS lengths, RT 
template lengths, and which strand to edit first. This flexibility, which 
contrasts with more limited options typically available for other 
precision editing methods, allows editing efficiency, product purity, 
DNA specificity, and other parameters to be optimized to suit a given 
application (Extended Data Fig. 9). 

Much additional research is needed to further understand and 
improve prime editing in a broad range of cell types and organisms, 
to assess off-target prime editing in a genome-wide manner, and to 
further characterize the extent to which prime editors might affect 
cells. Interfacing prime editing with additional in vitro and in vivo 
delivery strategies is essential for exploring the potential of prime 
editing to enable applications, including the study and treatment of 
genetic diseases. By enabling precise targeted transitions, transver- 
sions, insertions, and deletions in the genomes of mammalian cells 
without requiring DSBs, donor DNA templates, or HDR, however, prime 
editing provides a new search-and-replace capability that substantially 
expands the scope of genome editing. 


Methods 


General methods 

DNA amplification was conducted by PCR using Phusion U Green Mul- 
tiplex PCR Master Mix (ThermoFisher Scientific) or Q5 Hot Start High- 
Fidelity 2x Master Mix (New England BioLabs) unless otherwise noted. 
DNA oligonucleotides, including Cy5-labelled DNA oligonucleotides, 
dCas9 protein, and Cas9(H840A) protein were obtained from Integrated 
DNA Technologies. Yeast reporter plasmids were derived from 
previously described plasmids and cloned by the Gibson assembly method. 
All mammalian editor plasmids used in this work were assembled using 
the USER cloning method as previously described. Plasmids expressing 
sgRNAs were constructed by ligation of annealed oligonucleotides into 
BsmBI-digested acceptor vector (Addgene plasmid no. 65777). Plasmids 
expressing pegRNAs were constructed by Gibson assembly or Golden 
Gate assembly using a custom acceptor plasmid (see Supplementary 
Note 3). Sequences of sgRNA and pegRNA constructs used in this work 
are listed in Supplementary Tables 2 and 3. All vectors for mammalian 
cell experiments were purified using Plasmid Plus Midiprep kits (Qiagen) 
or PureYield plasmid miniprep kits (Promega), which include endotoxin 
removal steps. All experiments using live animals were approved by 
the Broad Institute Institutional and Animal Care and Use Committees. 
Wild-type C57BL/6 mice were obtained from Charles River (#027). No 
statistical methods were used to predetermine sample size. The experi- 
ments were not randomized, and investigators were not blinded to 
allocation during experiments and outcome assessment. 


In vitro biochemical assays 

pegRNAs and sgRNAs were transcribed in vitro using the HiScribe T7 
in vitro transcription kit (New England Biolabs) from PCR-amplified 
templates containing a T7 promoter sequence. RNA was purified by 
denaturing urea PAGE and quality-confirmed by an analytical gel before 
use. 5’-Cy5-labelled DNA duplex substrates were annealed using two 
oligonucleotides (Cy5-AVA024 and AVA025; 1:1.1 ratio) for the 
non-nicked substrate or three oligonucleotides (Cy5-AVA023, AVA025 and 
AVA026; 1:1.1:1.1) for the pre-nicked substrate by heating to 95 °C for 
3 min followed by slowly cooling to room temperature (Supplementary 
Table 2). Cas9 cleavage and reverse transcription reactions were carried 
out in 1x cleavage buffer supplemented with dNTPs (20 mM HEPES-K, 
pH 7.5; 100 mM KCl; 5% glycerol; 0.2 mM EDTA, pH 8.0; 3 mM MgCl2; 
0.5 mM dNTP mix; 5 mM DTT). dCas9 or Cas9(H840A) (5 µM final) and 
the sgRNA or pegRNA (5 µM final) were pre-incubated at room 
temperature in a 5-µl reaction mixture for 10 min before the addition of 0.5 µl of 
4 µM duplex DNA substrate (400 nM final), followed by the addition of 
0.2 µl of Superscript III reverse transcriptase (ThermoFisher Scientific), 
an undisclosed M-MLV RT variant, when applicable. Reactions were 
carried out at 37 °C for 1 h, then diluted to a volume of 10 µl with water, 
treated with 0.2 µl of proteinase K solution (20 mg/ml, ThermoFisher 
Scientific), and incubated at room temperature for 30 min. Following 
heat inactivation at 95 °C for 10 min, reaction products were combined 
with 2x formamide gel loading buffer (90% formamide; 10% glycerol; 
0.01% bromophenol blue), denatured at 95 °C for 5 min, and separated 
by denaturing urea PAGE gel (15% TBE-urea, 55 °C, 200 V). DNA products 
were visualized by Cy5 fluorescence signal using a Typhoon FLA 7000 
biomolecular imager. 

Electrophoretic mobility shift assays were carried out in 1x binding 
buffer (1x cleavage buffer with 10 µg/ml heparin) using pre-incubated 
dCas9-sgRNA or dCas9-pegRNA complexes (concentration between 
5 nM and 1 µM final) and Cy5-labelled duplex DNA (Cy5-AVA024 and 
AVA025; 20 nM final). After 15 min of incubation at 37 °C, the samples were 
analysed by native PAGE gel (10% TBE) and imaged for Cy5 fluorescence. 

For DNA sequencing of reverse transcription products, fluorescent 
bands were excised and purified from urea PAGE gels, then 3’ tailed 
with terminal transferase (TdT; New England Biolabs) in the presence 
of dGTP or dATP according to the manufacturer’s protocol. Tailed DNA 
products were diluted tenfold with binding buffer (40% saturated 
aqueous guanidinium chloride and 60% isopropanol) and purified 
by QIAquick spin column (Qiagen), then used as templates for primer 
extension by Klenow fragment (New England Biolabs) using primer 
AVA134 (A-tailed products) or AVA135 (G-tailed products) (Supplementary 
Table 2). Extensions were amplified by PCR for 10 cycles using 
primers AVA110 and AVA122, then sequenced with AVA037 using the 
Sanger method (Supplementary Table 2). 


Yeast fluorescent reporter assays 

Dual fluorescent reporter plasmids containing an in-frame stop codon, 
a +1 frameshift, or a −1 frameshift were subjected to 5’-extended pegRNA 
or 3’-extended pegRNA prime editing reactions in vitro as described 
above using 100 ng of plasmid substrate. Following incubation at 37 °C 
for 1h, the reactions were diluted with water and plasmid DNA was 
precipitated with 0.3 M sodium acetate and 70% ethanol. Resuspended 
DNA was transformed into Saccharomyces cerevisiae by electroporation 
as previously described and plated on synthetic complete medium 
without leucine (SC(glucose), L-). GFP and mCherry fluorescence 
signals were visualized from colonies with the Typhoon FLA 7000 
biomolecular imager. 


General mammalian cell culture conditions 

HEK293T (ATCC CRL-3216), U2OS (ATCC HTB-96), K562 (CCL-243), and 
HeLa (CCL-2) cells were purchased from ATCC and cultured and pas- 
saged in Dulbecco’s modified Eagle’s medium (DMEM) plus GlutaMAX 
(ThermoFisher Scientific), McCoy’s 5A medium (Gibco), RPMI medium 
1640 plus GlutaMAX (Gibco), or Eagle’s minimal essential medium 
(EMEM, ATCC), respectively, each supplemented with 10% (v/v) fetal 
bovine serum (Gibco, qualified) and 1x penicillin streptomycin (Corn- 
ing). All cell types were incubated, maintained, and cultured at 37 °C 
with 5% CO2. Cell lines were authenticated by their respective suppliers 
and tested negative for mycoplasma. 


HEK293T tissue culture transfection protocol and genomic DNA 
preparation 

HEK293T cells were seeded on 48-well poly-D-lysine coated plates 
(Corning). Between 16 and 24 h after seeding, cells were transfected 
at approximately 60% confluency with 1 µl lipofectamine 2000 (Thermo 
Fisher Scientific) according to the manufacturer’s protocols and 750 ng 
PE plasmid, 250 ng pegRNA plasmid, and 83 ng sgRNA plasmid (for 
PE3 and PE3b). Unless otherwise stated, cells were cultured for 3 days 
following transfection, after which the medium was removed, the 
cells were washed with 1x PBS solution (Thermo Fisher Scientific), 
and genomic DNA was extracted by the addition of 150 µl of freshly 
prepared lysis buffer (10 mM Tris-HCl, pH 7.5; 0.05% SDS; 25 µg/ml 
proteinase K (ThermoFisher Scientific)) directly into each well of the 
tissue culture plate. The genomic DNA mixture was incubated at 37 °C 
for 1-2 h, followed by an 80 °C enzyme inactivation step for 30 min. 
Primers used for mammalian cell genomic DNA amplification are listed 
in Supplementary Table 4. For HDR experiments in HEK293T cells, 
231 ng Cas9 nuclease-expression plasmid, 69 ng sgRNA-expression 
plasmid and 50 ng (1.51 pmol) of 100-nt ssDNA donor template (PAGE- 
purified; Integrated DNA Technologies) was lipofected using 1.4 µl 
lipofectamine 2000 (ThermoFisher) per well. Genomic DNA from all 
HDR experiments was purified using the Agencourt DNAdvance Kit 
(Beckman Coulter), according to the manufacturer’s protocol. 
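
The donor amounts quoted in mass and molar units can be cross-checked with a simple conversion. The sketch below assumes an average mass of about 330 g/mol per nucleotide of single-stranded DNA, a common rule of thumb that is not stated in the text; the function names are likewise illustrative. With that assumption, 50 ng of a 100-nt donor comes out at roughly 1.5 pmol, in line with the value given above.

```python
AVG_SSDNA_NT_MASS = 330.0  # g/mol per nucleotide; assumed rule-of-thumb value


def ssdna_ng_to_pmol(mass_ng: float, length_nt: int) -> float:
    """Convert a mass of ssDNA (ng) to an amount (pmol)."""
    # ng divided by g/mol gives nmol; multiply by 1000 for pmol.
    return mass_ng / (length_nt * AVG_SSDNA_NT_MASS) * 1000.0


def ssdna_pmol_to_ug(amount_pmol: float, length_nt: int) -> float:
    """Convert an amount of ssDNA (pmol) to a mass (µg)."""
    # pmol times g/mol gives pg; divide by 1e6 for µg.
    return amount_pmol * length_nt * AVG_SSDNA_NT_MASS / 1e6


print(f"{ssdna_ng_to_pmol(50, 100):.2f} pmol")  # 50 ng of a 100-nt donor
print(f"{ssdna_pmol_to_ug(200, 100):.1f} µg")   # 200 pmol of a 100-nt donor
```

The exact molar mass depends on the donor sequence, so the vendor-reported molecular weight should be preferred when it is available.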


High-throughput DNA sequencing of genomic DNA samples 

Genomic sites of interest were amplified from genomic DNA samples 
and sequenced on an Illumina MiSeq as previously described with the 
following modifications. In brief, amplification primers containing 
Illumina forward and reverse adapters (Supplementary Table 4) were 
used for a first round of PCR (PCR 1) to amplify the genomic region of 
interest. PCR 1 reactions (25 µl) were performed with 0.5 µM of each 
forward and reverse primer, 1 µl genomic DNA extract and 12.5 µl 
Phusion U Green Multiplex PCR Master Mix. PCR reactions were carried out 
as follows: 98 °C for 2 min, then 30 cycles of [98 °C for 10 s, 61 °C for 
20 s, and 72 °C for 30 s], followed by a final 72 °C extension for 2 min. 
Unique Illumina barcoding primer pairs were added to each sample in 
a secondary PCR reaction (PCR 2). Specifically, 25 µl of a given PCR 2 
reaction contained 0.5 µM of each unique forward and reverse Illumina 
barcoding primer pair, 1 µl unpurified PCR 1 reaction mixture, and 12.5 µl 
Phusion U Green Multiplex PCR 2x Master Mix. The barcoding PCR 2 
reactions were carried out as follows: 98 °C for 2 min, then 12 cycles 
of [98 °C for 10 s, 61 °C for 20 s, and 72 °C for 30 s], followed by a final 
72 °C extension for 2 min. PCR products were evaluated analytically by 
electrophoresis in a 1.5% agarose gel. PCR 2 products (pooled by 
common amplicons) were purified by electrophoresis with a 1.5% agarose 
gel using a QIAquick Gel Extraction Kit (Qiagen), eluting with 40 µl 
water. DNA concentration was measured by fluorometric quantification 
(Qubit, ThermoFisher Scientific) or qPCR (KAPA Library Quantification 
Kit-Illumina, KAPA Biosystems) and sequenced on an Illumina MiSeq 
instrument according to the manufacturer’s protocols. 
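
For planning purposes, the two cycling programs above can be written down as data and their total hold times added up. The sketch below is a convenience illustration only: the class and field names are invented, and ramp times between temperatures are ignored, so real run times on a thermocycler will be longer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PcrProgram:
    initial_denature_s: int
    cycles: int
    cycle_steps_s: Tuple[int, ...]  # hold times (s) for denature, anneal, extend
    final_extension_s: int

    def hold_time_s(self) -> int:
        """Sum of programmed hold times, excluding temperature ramps."""
        return (self.initial_denature_s
                + self.cycles * sum(self.cycle_steps_s)
                + self.final_extension_s)


# PCR 1: 98 °C 2 min; 30 x (98 °C 10 s, 61 °C 20 s, 72 °C 30 s); 72 °C 2 min
PCR1 = PcrProgram(120, 30, (10, 20, 30), 120)
# PCR 2: same steps with 12 cycles
PCR2 = PcrProgram(120, 12, (10, 20, 30), 120)

for name, program in (("PCR 1", PCR1), ("PCR 2", PCR2)):
    print(f"{name}: {program.hold_time_s() / 60:.0f} min of hold time")
```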

Sequencing reads were demultiplexed using MiSeq Reporter (Illu- 
mina). Alignment of amplicon sequences to a reference sequence was 
performed using CRISPResso2. For all prime editing yield 
quantification, prime editing efficiency was calculated as: percentage of (number 
of reads with the desired edit that do not contain indels)/(number of 
total reads). For quantification of point mutation editing, CRISPResso2 
was run in standard mode with “discard_indel_reads” on. Prime edit- 
ing for installation of point mutations was then explicitly calculated 
as: (frequency of specified point mutation in non-discarded reads) x 
(number of non-discarded reads)/(total reads). For insertion or 
deletion edits, CRISPResso2 was run in HDR mode using the desired allele 
as the expected allele (-e flag), and with “discard_indel_reads” on. Editing 
yield was calculated as: (number of HDR-aligned reads)/(total reads). 
For all experiments, indel yields were calculated as: (number of indel- 
containing reads)/(total reads). 
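
The yield formulas above reduce to a few ratios of read counts. The sketch below restates them as Python functions; the function names and the read counts in the example are hypothetical, and the code does not read CRISPResso2 output files, it only shows the arithmetic applied to counts taken from such an analysis.

```python
def point_mutation_efficiency(freq_in_kept_reads: float,
                              kept_reads: int,
                              total_reads: int) -> float:
    """Percent of all reads carrying the point mutation and no indel.

    freq_in_kept_reads is the fraction (0 to 1) of non-discarded reads
    that contain the specified point mutation.
    """
    return 100.0 * freq_in_kept_reads * kept_reads / total_reads


def hdr_mode_efficiency(hdr_aligned_reads: int, total_reads: int) -> float:
    """Percent of all reads aligned to the expected (edited) allele."""
    return 100.0 * hdr_aligned_reads / total_reads


def indel_yield(indel_reads: int, total_reads: int) -> float:
    """Percent of all reads containing indels."""
    return 100.0 * indel_reads / total_reads


# Hypothetical sample: 10,000 reads, 700 discarded for indels,
# 40% of the remaining 9,300 reads carry the intended point mutation.
total, indels = 10_000, 700
kept = total - indels
print(f"editing: {point_mutation_efficiency(0.40, kept, total):.1f}%")
print(f"indels:  {indel_yield(indels, total):.1f}%")
```

Normalizing to total reads rather than to non-discarded reads keeps the editing and indel percentages on the same denominator, so the two values can be compared directly.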


Nucleofection of U2OS, K562, and HeLa cells 

Nucleofection was used for transfection in all experiments using K562, 
HeLa, and U2OS cells. For PE conditions in these cell types, 800 ng 
prime editor expression plasmid, 200 ng pegRNA expression plasmid, 
and 83 ng nicking sgRNA expression plasmid was nucleofected in a 
final volume of 20 µl in a 16-well nucleocuvette strip (Lonza). For HDR 
conditions in these three cell types, 350 ng Cas9 nuclease expression 
plasmid, 150 ng sgRNA expression plasmid and 200 pmol (6.6 µg) 
100-nt ssDNA donor template (PAGE-purified; Integrated DNA 
Technologies) was nucleofected in a final volume of 20 µl per sample in a 16-well 
Nucleocuvette strip (Lonza). K562 cells were nucleofected using the 
SF Cell Line 4D-Nucleofector X Kit (Lonza) with 5 × 10⁵ cells per sample 
(program FF-120), according to the manufacturer’s protocol. U2OS 
cells were nucleofected using the SE Cell Line 4D-Nucleofector X Kit 
(Lonza) with 3-4 × 10⁵ cells per sample (program DN-100), according 
to the manufacturer’s protocol. HeLa cells were nucleofected using the 
SE Cell Line 4D-Nucleofector X Kit (Lonza) with 2 × 10⁵ cells per sample 
(program CN-114), according to the manufacturer’s protocol. Cells 
were harvested 72 h after nucleofection for genomic DNA extraction. 


Genomic DNA extraction for HDR experiments 

Genomic DNA from all HDR comparison experiments in HEK293T, 
HEK293T HBB(E6V), K562, U2OS, and HeLa cells was purified using 
the Agencourt DNAdvance Kit (Beckman Coulter), according to the 
manufacturer’s protocol. 


Comparison between PE2, PE3, BE2, BE4max, ABEdmax, and 
ABEmax 

HEK293T cells were seeded on 48-well poly-D-lysine coated plates 
(Corning). After 16-24 h, cells were transfected at approximately 60% 
confluency. For base editing with CBE or ABE constructs, cells were 
transfected with 750 ng base editor plasmid, 250 ng sgRNA expression 
plasmid, and 1 µl of lipofectamine 2000 (Thermo Fisher Scientific). 
PE transfections were performed as described above. Genomic DNA 
extraction for PE and BE was performed as described above. 


Determination of PE3 activity at known Cas9 off-target sites 

To evaluate PE3 off-target editing activity at known Cas9 off-target sites, 
genomic DNA extracted from HEK293T cells 3 days after transfection 
with PE3 was used as template for PCR amplification of 16 previously 
reported Cas9 off-target genomic sites (the top four off-target sites 
each for the HEK3, EMX1, FANCF, and HEK4 spacers; primer sequences 
are listed in Supplementary Table 4). These genomic DNA samples were 
identical to those used for quantifying on-target PE3 editing activi- 
ties shown in Fig. 4 or Extended Data Fig. 5d, e; pegRNA and nicking 
sgRNA sequences are listed in Supplementary Table 3. Following PCR 
amplification of off-target sites, amplicons were sequenced on the 
Illumina MiSeq platform as described above (see ‘High-throughput 
DNA sequencing of genomic DNA samples’ section). To determine the 
on-target and off-target editing activity of Cas9 nuclease, Cas9(H840A) 
nickase, dCas9, and PE2-dRT, we transfected HEK293T cells with 750 ng 
editor plasmid (Cas9 nuclease, Cas9(H840A) nickase, dCas9, or PE2- 
dRT), 250 ng pegRNA or sgRNA plasmid, and 1 pl lipofectamine 2000. 
Genomic DNA was isolated from cells 3 days after transfection as 
described above. On-target and off-target genomic loci were ampli- 
fied by PCR using the primer sequences in Supplementary Table 4 and 
sequenced on an Illumina MiSeq. 

High-throughput sequencing (HTS) data analysis was performed
using CRISPResso2. The editing efficiencies of Cas9 nuclease,
Cas9(H840A) nickase, and dCas9 were quantified as the percentage of total
sequencing reads containing indels. For quantification of PE3 and
PE3-dRT off-targets, aligned sequencing reads were examined for
point mutations, insertions, or deletions that were consistent with
the anticipated product of pegRNA reverse transcription initiated at
the Cas9 nick site. Single nucleotide variations occurring at <0.1%
overall frequency among total reads within a sample were excluded from
analysis. For reads containing single nucleotide variations that both
occurred at frequencies >0.1% and were partially consistent with the
pegRNA-encoded edit, t-tests (unpaired, one-tailed, α = 0.5) were used
to determine whether the variants occurred at significantly higher
levels compared to samples treated with pegRNAs that contained the same
spacer but encoded different edits. To avoid differences in sequencing
errors, comparisons were made between samples that were sequenced
simultaneously within the same MiSeq run. Variants that did not meet
the criteria of P > 0.05 were excluded. Off-target PE3 editing activity
was then calculated as the percentage of total sequencing reads that
met the above criteria.
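
The read-percentage and frequency-filter steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (`VariantCount`, `percent_of_reads`, `passes_frequency_filter`) are hypothetical and are not taken from CRISPResso2 or from the authors' scripts, and the t-test comparison between pegRNAs is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class VariantCount:
    """Read counts for one candidate variant in one sequenced sample (hypothetical container)."""
    name: str
    variant_reads: int
    total_reads: int


def percent_of_reads(matching_reads: int, total_reads: int) -> float:
    """Percentage of total sequencing reads that carry the feature of interest."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * matching_reads / total_reads


def passes_frequency_filter(v: VariantCount, min_percent: float = 0.1) -> bool:
    """Keep a single-nucleotide variation only if it reaches the 0.1% overall frequency floor."""
    return percent_of_reads(v.variant_reads, v.total_reads) >= min_percent


def filter_variants(variants, min_percent: float = 0.1):
    """Drop variants below the frequency floor; the statistical test would follow this step."""
    return [v for v in variants if passes_frequency_filter(v, min_percent)]
```

For example, a variant seen in 5 of 10,000 reads (0.05%) would be dropped, whereas one seen in 20 of 10,000 reads (0.2%) would be carried forward to the statistical comparison.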


Generation of a HEK293T cell line containing HBB(E6V) using 
Cas9-initiated HDR 

HEK293T cells were seeded in a 48-well plate and transfected at
approximately 60% confluency with 1.5 µl Lipofectamine 2000, 300 ng
Cas9(D10A) nickase plasmid, 100 ng sgRNA plasmid, and 200 ng
100-mer ssDNA donor template (Supplementary Table 5). Three days
after transfection, the medium was exchanged for fresh medium. Four
days after transfection, cells were dissociated using 30 µl TrypLE
solution and suspended in 1.5 ml medium. Single cells were isolated into
individual wells of two 96-well plates by fluorescence-activated cell
sorting (FACS) (Beckman-Coulter Astrios). See Supplementary Note
1 for representative FACS sorting examples. Cells were expanded for
14 days before genomic DNA sequencing as described above. Of the
isolated clonal populations, none was found to be homozygous for the
HBB allele encoding the E6V mutation, so a second round of editing by
lipofection, sorting, and outgrowth was repeated in a partially edited
cell line to yield a cell line homozygous for the E6V-encoding allele.


Generation of a HEK293T cell line containing HBB(E6V) using PE3

HEK293T cells (2.5 × 10⁴) were seeded on 48-well poly-D-lysine coated
plates (Corning). Between 16 and 24 h after seeding, cells were
transfected at approximately 70% confluency with 1 µl Lipofectamine 2000
(Thermo Fisher Scientific) according to the manufacturer’s protocols
and 750 ng PE2-P2A-GFP plasmid, 250 ng pegRNA plasmid, and 83 ng
sgRNA plasmid. After 3 days, cells were washed with 1× PBS (Gibco) and
dissociated using TrypLE Express (Gibco). Cells were then diluted with
DMEM plus GlutaMax (Thermo Fisher Scientific) supplemented with
10% (v/v) FBS (Gibco) and passed through a 35-µm cell strainer
(Corning) before sorting. Flow cytometry was carried out on a LE-MA900 cell
sorter (Sony). Cells were treated with 3 nM DAPI (BioLegend) 15 min
before sorting. After gating for doublet exclusion, single DAPI-negative
cells with GFP fluorescence above that of a GFP-negative control cell
population were sorted into 96-well flat-bottom cell culture plates
(Corning) filled with pre-chilled DMEM with GlutaMax supplemented
with 10% FBS. See Supplementary Note 1 for representative FACS
sorting examples and allele tables. Cells were cultured for 10 days before
genomic DNA extraction and characterization by HTS, as described
above. A total of six clonal cell lines were identified that are homozygous
for the E6V-encoding mutation in HBB.


Generation of a HEK293T cell line containing the HEXA 1278+TATC
insertion using PE3

HEK293T cells containing the HEXA 1278+TATC allele were generated
following the protocol described above for creation of the HBB(E6V) cell
line; pegRNA and sgRNA sequences are listed in Supplementary Table 3
under the Fig. 5 subheading. After transfection and sorting, cells were
cultured for 10 days before genomic DNA was extracted and characterized
by HTS, as described above. We recovered two heterozygous cell
lines that contained 50% HEXA 1278+TATC alleles and two homozygous cell
lines containing 100% HEXA 1278+TATC alleles.


Cell viability assays 

HEK293T cells were seeded in 48-well plates and transfected at
approximately 70% confluency with 750 ng editor plasmid (PE2,
PE2(R110S/K103L), Cas9(H840A) nickase, or dCas9), 250 ng HEK3-targeting
pegRNA plasmid, and 1 µl Lipofectamine 2000, as described above.
Cell viability was measured every 24 h post-transfection for 3 days using
the CellTiter-Glo 2.0 assay (Promega) according to the manufacturer’s
protocol. Luminescence was measured in 96-well flat-bottomed
polystyrene microplates (Corning) using a M1000 Pro microplate reader
(Tecan) with a 1-s integration time.


Lentivirus production 

Lentivirus was produced as previously described. T-75 flasks of rapidly
dividing HEK293T cells (ATCC; Manassas, VA, USA) were transfected
with lentivirus production helper plasmids pVSV-G and psPAX2 in
combination with modified lentiCRISPRv2 genomes carrying intein-split
PE2 editor using FuGENE HD (Promega, Madison, WI, USA) according
to the manufacturer’s protocol. Four split-intein editor constructs
were designed: 1) a viral genome encoding a U6-pegRNA expression
cassette and the N-terminal portion (1–573) of Cas9(H840A) nickase
fused to the Npu N-intein, a self-cleaving P2A peptide, and GFP-KASH;
2) a viral genome encoding the Npu C-intein fused to the C-terminal
remainder of PE2; 3) a viral genome encoding the Npu C-intein fused
to the C-terminal remainder of Cas9 for the Cas9 control; and 4) a
nicking sgRNA for DNMT1 (derived from Addgene plasmid no. 52963). The
split-intein mediates trans splicing to join the two halves of PE2 or
Cas9, while the P2A GFP-KASH enables co-translational production of a
nuclear membrane-localized GFP. After 48 h, supernatant was collected,
centrifuged at 500g for 5 min to remove cellular debris, and filtered
using a 0.45-µm filter. Filtered supernatant was concentrated using the
PEG-it Virus Precipitation Solution (System Biosciences, Palo Alto, CA,
USA) according to the manufacturer’s directions. The resulting pellet
was resuspended in Opti-MEM (Thermo Fisher Scientific, Waltham,
MA, USA) using 1% of the original medium volume. Resuspended pellet
was flash-frozen and stored at −80 °C until use.


Mouse primary cortical neuron dissection and culture 

E18.5 dissociated cortical cultures were taken from timed-pregnant
C57BL/6 mice (Charles River). Embryos were removed from pregnant
mice after euthanasia by CO2 followed by decapitation. Cortical caps
were dissected in ice-cold Hibernate-E supplemented with
penicillin/streptomycin (Life Technologies). Following a rinse with ice-cold
Hibernate-E, tissue was digested at 37 °C for 8 min in papain/DNase
(Worthington/Sigma). Tissue was triturated in NBActiv4 (BrainBits)
supplemented with DNase. Cells were counted and plated in 24-well plates at
100,000 cells per well. Half of the medium was changed twice per week.


Prime editing in primary neurons and nucleus isolation 

At days in vitro (DIV) 1, 15 µl lentivirus was added at a 10:10:1 ratio of
N-terminal:C-terminal:nicking sgRNA. At DIV 14, neuronal nuclei were
isolated using the EZ-PREP buffer (Sigma D8938) following the
manufacturer’s protocol. All steps were performed on ice or at 4 °C. Medium
was removed from dissociated cultures, and cultures were washed
with ice-cold PBS. PBS was aspirated and replaced with 200 µl EZ-PREP
solution. Following a 5-min incubation on ice, EZ-PREP was pipetted
across the surface of the well to dislodge remaining cells. The sample
was centrifuged at 500g for 5 min, and the supernatant removed.
Samples were washed with 200 µl EZ-PREP and centrifuged again at
500g for 5 min. Samples were resuspended with gentle pipetting in
200 µl ice-cold Nuclei Suspension Buffer (NSB) consisting of 100 µg/ml
BSA and 3.33 µM Vybrant DyeCycle Ruby (Thermo Fisher) in 1× PBS,
then centrifuged at 500g for 5 min. The supernatant was removed and
nuclei were resuspended in 100 µl NSB and sorted into 100 µl Agencourt
DNAdvance lysis buffer using a MoFlo Astrios (Beckman Coulter) at
the Broad Institute flow cytometry facility. Genomic DNA was purified
according to the manufacturer’s Agencourt DNAdvance instructions.


RNA-seq and data analysis 

HEK293T cells were co-transfected with PRNP-targeting or HEXA-targeting
pegRNAs and PE2, PE2-dRT, or Cas9(H840A) nickase. Seventy-two
hours after transfection, total RNA was harvested from cells using TRIzol
reagent (Thermo Fisher) and purified with the RNeasy Mini kit (Qiagen)
including on-column DNase I treatment. Ribosomal RNA was depleted from
total RNA using the rRNA removal protocol of the TruSeq Stranded
Total RNA library prep kit (Illumina), and the depleted RNA was
subsequently washed with RNAClean XP beads (Beckman Coulter). Sequencing
libraries were prepared using ribo-depleted RNA on a SMARTer PrepX Apollo
NGS library prep system (Takara) following the manufacturer’s protocol.
The resulting libraries were visualized on a 2200 TapeStation (Agilent
Technologies), normalized using a Qubit dsDNA HS assay (Thermo
Fisher), and sequenced on a NextSeq 550 using a high output v2 flow
cell (Illumina) as 75-bp paired-end reads. Fastq files were generated
with bcl2fastq2 version 2.20 and trimmed using TrimGalore version
0.6.2 (https://github.com/FelixKrueger/TrimGalore) to remove
low-quality bases, unpaired sequences, and adaptor sequences. Trimmed
reads were aligned to a Homo sapiens genome assembly GRCh38 with a
custom Cas9(H840A) gene entry using RSEM version 1.3.1. The
limma-voom package was used to normalize gene expression levels and
perform differential expression analysis with batch effect correction.
Differentially expressed genes were called with FDR-corrected P < 0.05
and fold change > 2 cutoffs, and results were visualized in R.
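
The final calling step (FDR-corrected P < 0.05 and fold change > 2) can be illustrated with a short Python sketch. It assumes that the FDR correction is the Benjamini-Hochberg procedure, which the text does not state explicitly, and it takes per-gene P values and log2 fold changes as plain lists rather than limma-voom output. The function names are hypothetical and this is not the authors' R code.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_differential(genes, pvalues, log2_fold_changes, fdr=0.05, min_fold=2.0):
    """List genes passing both the adjusted-P cutoff and the fold-change cutoff."""
    adjusted = benjamini_hochberg(pvalues)
    log2_threshold = math.log2(min_fold)  # fold change > 2 means |log2FC| > 1
    return [
        gene
        for gene, q, lfc in zip(genes, adjusted, log2_fold_changes)
        if q < fdr and abs(lfc) > log2_threshold
    ]
```

Applying the fold-change cutoff to the absolute log2 value treats upregulated and downregulated transcripts symmetrically, which matches how the volcano plots in Extended Data Fig. 8 are described.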


ClinVar analysis 

The ClinVar variant summary was downloaded from NCBI (accessed July
15, 2019), and the information contained therein was used for all
downstream analysis. The list of all reported variants was filtered by allele
ID in order to remove duplicates and by clinical significance in order
to restrict the analysis to pathogenic variants. The list of pathogenic
variants was filtered sequentially by variant type in order to calculate
the fraction of pathogenic variants that are insertions, deletions, and
so on. Single nucleotide variants (SNVs) were separated into two
categories (transitions and transversions) on the basis of the reported
reference and alternate alleles. SNVs that did not report reference or
alternate alleles were excluded from the analysis.
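
The transition/transversion split can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `classify_snv` is hypothetical, and returning `None` for a missing or non-standard allele is an assumed way of marking the SNVs that the text says were excluded.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snv(ref, alt):
    """Return 'transition' or 'transversion', or None when the SNV cannot be classified."""
    if not ref or not alt:
        return None  # reference or alternate allele not reported
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        return None  # not a single-base substitution between standard bases
    # A transition stays within one chemical class; a transversion crosses classes.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

Under this scheme A>G and C>T are transitions, whereas A>T and G>C are transversions.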

The lengths of reported insertions, deletions, and duplications were
calculated using reference/alternate alleles, variant start/stop
positions, or appropriate identifying information in the variant name.
Variants that did not report any of the above information were excluded
from the analysis. The lengths of reported indels (single variants that
include both insertions and deletions relative to the reference genome)
were calculated by determining the number of mismatches or gaps
in the best pairwise alignment between the reference and alternate
alleles. Frequency distributions of variant lengths were calculated
using GraphPad Prism 8.
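
The two length calculations can be sketched in Python as below. The alignment scoring is an assumption: the text does not give the scoring scheme, so this sketch uses unit costs for mismatches and gap positions, under which the count of mismatches or gaps in the best alignment equals the Levenshtein edit distance. Both function names are hypothetical, and only the reference/alternate-allele route is covered, not start/stop positions or variant names.

```python
def simple_variant_length(ref: str, alt: str) -> int:
    """Length of a pure insertion, deletion, or duplication from its two alleles."""
    return abs(len(alt) - len(ref))


def indel_length(ref: str, alt: str) -> int:
    """Mismatches plus gap positions in the best alignment, assuming unit costs."""
    previous = list(range(len(alt) + 1))  # aligning an empty prefix of ref against alt
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, a in enumerate(alt, start=1):
            current.append(
                min(
                    previous[j] + 1,             # gap in alt (base deleted from ref)
                    current[j - 1] + 1,          # gap in ref (base inserted in alt)
                    previous[j - 1] + (r != a),  # match or mismatch
                )
            )
        previous = current
    return previous[-1]
```

For instance, a reference allele `A` with alternate allele `ATATC` gives a length of 4 under `simple_variant_length`, consistent with a 4-bp insertion.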


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


High-throughput sequencing data have been deposited to the NCBI 
Sequence Read Archive database under accession PRJNA565979. 
Plasmids encoding PE1, PE2 (same as PE3), and pegRNA expression 
vectors are available from Addgene. Previously described plasmids 
expressing sgRNAs are also available from Addgene, such as Addgene 
plasmid no. 65777. 


Code availability 


The script used to quantify pegRNA scaffold insertion is provided 
as Supplementary Note 4. 
Extended Data Fig. 1 | In vitro prime editing validation studies with
fluorescently labelled DNA substrates. a, Electrophoretic mobility shift
assays with dCas9, 5′-extended pegRNAs and 5′-Cy5-labelled DNA substrates.
pegRNAs 1–5 contain a 15-nt linker sequence (linker A for pegRNA 1, linker B
for pegRNAs 2–5) between the spacer and the PBS, a 5-nt PBS sequence, and RT
templates of 7 nt (pegRNAs 1 and 2), 8 nt (pegRNA 3), 15 nt (pegRNA 4), and
22 nt (pegRNA 5). pegRNAs are those used in e and f; full sequences are listed
in Supplementary Table 2. b, In vitro nicking assays of Cas9(H840A) using
5′-extended and 3′-extended pegRNAs. Data in a, b are representative of n = 2
independent replicates. c, Cas9-mediated indel formation in HEK293T cells at
HEK3 using 5′-extended and 3′-extended pegRNAs. Mean ± s.d. of n = 3
independent biological replicates. d, Overview of prime editing in vitro
biochemical assays. 5′-Cy5-labelled pre-nicked and non-nicked dsDNA
substrates were tested. sgRNAs, 5′-extended pegRNAs, or 3′-extended
pegRNAs were pre-complexed with dCas9 or Cas9(H840A) nickase, then
combined with dsDNA substrate, Superscript III M-MLV RT, and dNTPs.
Reactions were allowed to proceed at 37 °C for 1 h before separation by
denaturing urea PAGE and visualization by Cy5 fluorescence. e, Primer
extension reactions using 5′-extended pegRNAs, pre-nicked DNA substrates,
and dCas9 lead to substantial conversion to RT products. f, Primer extension
reactions using 5′-extended pegRNAs as in b with non-nicked DNA substrate
and Cas9(H840A) nickase. Product yields are greatly reduced by comparison
to pre-nicked substrate. g, An in vitro primer extension reaction using a
3′-pegRNA generates a single apparent product by denaturing urea PAGE. The
RT product band was excised, eluted from the gel, then subjected to
homopolymer tailing with terminal transferase (TdT) using either dGTP or
dATP. Tailed products were extended using poly-T or poly-C primers, and the
resulting DNA was sequenced. Sanger traces indicate that three nucleotides
derived from the pegRNA scaffold were reverse-transcribed (added as the final
3′ nucleotides to the DNA product). Note that pegRNA scaffold insertion is
much rarer in mammalian cell prime editing experiments than in vitro
(Extended Data Fig. 6), potentially owing to the inability of the tethered RT to
access the Cas9-bound guide RNA scaffold, and/or cellular excision of
mismatched 3′ ends of 3′ flaps containing pegRNA scaffold sequences. Data in
e–g are representative of n = 2 independent replicates. For gel source data,
see Supplementary Fig. 1.

Extended Data Fig. 2 | Cellular repair in yeast of 3′ DNA flaps from in vitro
prime editing reactions. a, Dual fluorescent protein reporter plasmids
contain GFP and mCherry open reading frames separated by a target site
encoding an in-frame stop codon, a +1 frameshift, or a −1 frameshift. Prime
editing reactions were carried out in vitro with Cas9(H840A) nickase, pegRNA,
dNTPs, and M-MLV RT, then transformed into yeast. Colonies that contain
unedited plasmids produce GFP but not mCherry. Yeast colonies containing
edited plasmids produce both GFP and mCherry as a fusion protein. b, Overlay
of GFP and mCherry fluorescence for yeast colonies transformed with reporter
plasmids containing a stop codon between GFP and mCherry (unedited
negative control, top), or containing no stop codon or frameshift between GFP
and mCherry (pre-edited positive control, bottom). c–f, Visualization of
mCherry and GFP fluorescence from yeast colonies transformed with in vitro
prime editing reaction products. c, d, Stop codon correction via T•A-to-A•T
transversion using a 3′-extended pegRNA (c) or a 5′-extended pegRNA (d).
e, +1 frameshift correction via a 1-bp deletion using a 3′-extended pegRNA.
f, −1 frameshift correction via a 1-bp insertion using a 3′-extended pegRNA.
g, Sanger DNA sequencing traces from plasmids isolated from GFP-only
colonies in b and GFP and mCherry double-positive colonies in c. Data in b–g
are representative of n = 2 independent replicates.

Extended Data Fig. 7 | Incorporation of pegRNA scaffold sequence into
target loci. HTS data were analysed for pegRNA scaffold sequence insertion as
described in Supplementary Note 4. a, Analysis for the EMX1 locus. Shown is the
percentage of total sequencing reads containing one or more pegRNA scaffold
sequence nucleotides within an insertion adjacent to the RT template (left); the
percentage of total sequencing reads containing a pegRNA scaffold sequence
insertion of the specified length (middle); and the cumulative total percentage
of pegRNA insertion up to and including the length specified on the x-axis. b, As
in a for FANCF. c, As in a for HEK3. d, As in a for RNF2. Mean ± s.d. of n = 3
independent biological replicates.

Extended Data Fig. 8 | Effects of PE2, PE2-dRT, Cas9(H840A) nickase, and
dCas9 on cell viability and on transcriptome-wide RNA abundance.
HEK293T cells were transiently transfected with plasmids encoding PE2,
PE2(R110S/K103L), Cas9(H840A) nickase, or dCas9, together with a
HEK3-targeting pegRNA plasmid. Cell viability was measured for the bulk cellular
population every 24 h after transfection for 3 days using the CellTiter-Glo 2.0
assay (Promega). a, Viability, as measured by luminescence, at 1, 2, or 3 days
after transfection. Mean ± s.e.m. of n = 3 independent biological replicates,
each performed in technical triplicate. b, Percentage editing and indels for PE2,
PE2(R110S/K103L), Cas9(H840A) nickase, or dCas9, together with a
HEK3-targeting pegRNA plasmid that encodes a +5 G-to-A edit. Editing efficiencies
were measured on day 3 after transfection from cells treated alongside those
used for assaying viability in a. Mean ± s.d. of n = 3 independent biological
replicates. c–k, Analysis of cellular RNA, depleted for ribosomal RNA, isolated
from HEK293T cells expressing PE2, PE2-dRT, or Cas9(H840A) nickase and a
PRNP-targeting or HEXA-targeting pegRNA. RNAs corresponding to 14,410
genes and 14,368 genes were detected in PRNP and HEXA samples, respectively.
c–h, Volcano plot displaying the −log₁₀ FDR-adjusted P value versus log₂-fold
change in transcript abundance for each RNA, comparing PE2 versus PE2-dRT
with PRNP-targeting pegRNA (c), PE2 versus Cas9(H840A) with PRNP-targeting
pegRNA (d), PE2-dRT versus Cas9(H840A) with PRNP-targeting pegRNA (e),
PE2 versus PE2-dRT with HEXA-targeting pegRNA (f), PE2 versus Cas9(H840A)
with HEXA-targeting pegRNA (g), PE2-dRT versus Cas9(H840A) with
HEXA-targeting pegRNA (h). Red dots indicate genes that show twofold or more
changes in relative abundance that are statistically significant (FDR-adjusted
P < 0.05). i–k, Venn diagrams of upregulated and downregulated transcripts
(twofold change or more) comparing PRNP and HEXA samples for PE2 versus
PE2-dRT (i), PE2 versus Cas9(H840A) (j), and PE2-dRT versus Cas9(H840A) (k).
Values for each RNA-seq condition reflect the mean of n = 5 biological
replicates. Differential expression was assessed using a two-sided t-test with
empirical Bayesian variance estimation.

Extended Data Fig. 9 | PE3-mediated correction of E6V-encoding HBB
mutation and HEXA 1278+TATC by various pegRNAs. a, Screen of 14 pegRNAs for
correction of the HBB E6V-encoding allele in HEK293T cells with PE3. All
pegRNAs evaluated convert the mutant HBB allele back to wild-type HBB
without the introduction of any silent PAM mutation. b, Screen of 41 pegRNAs
for correction of the HEXA 1278+TATC allele in HEK293T cells with PE3 or PE3b.
Those pegRNAs labelled HEXAs correct the pathogenic allele by a shifted 4-bp
deletion that disrupts the PAM and leaves a silent mutation. Those pegRNAs
labelled HEXA correct the pathogenic allele back to wild-type. Entries ending in
b use an edit-specific nicking sgRNA in combination with the pegRNA (the PE3b
system). Mean ± s.d. of n = 3 independent biological replicates.

Extended Data Fig. 10 | PE3 activity in human cell lines and comparison of
PE3 and Cas9-initiated HDR. a, Prime editing in K562 (leukaemic bone
marrow), U2OS (osteosarcoma), and HeLa (cervical cancer) cells.
b–e, Efficiency of generating the correct edit (without indels) and indel
frequency for PE3 and Cas9-initiated HDR in HEK293T cells (b), K562 cells
(c), U2OS cells (d), and HeLa cells (e). Each bracketed editing comparison
installs identical edits with PE3 and Cas9-initiated HDR. Non-targeting controls
are PE3 and a pegRNA that targets a non-target locus. f, Control experiments
with non-targeting pegRNA + PE3, and with dCas9 + sgRNA, compared with
wild-type Cas9 HDR experiments confirming that ssDNA donor HDR template,
a common contaminant that artificially elevates apparent HDR efficiencies,
does not contribute to the HDR measurements in a–d. g, Example HEK3 site
allele tables from genomic DNA samples isolated from K562 cells after editing
with PE3 or with Cas9-initiated HDR. Alleles were sequenced on an Illumina
MiSeq and analysed using CRISPResso2. The reference HEK3 sequence from
this region is at the top. Allele tables are shown for a non-targeting pegRNA
negative control, a +1 CTT insertion at HEK3 using PE3, and a +1 CTT insertion at
HEK3 using Cas9-initiated HDR. Allele frequencies and corresponding Illumina
sequencing read counts are shown for each allele. All alleles observed with
frequency >0.20% are shown. Mean ± s.d. of n = 3 independent biological
replicates.

Extended Data Fig. 11 | Distribution by length of pathogenic insertions,
duplications, deletions, and indels in the ClinVar database. The ClinVar
variant summary was downloaded from NCBI on 15 July 2019. The lengths of
reported insertions, deletions, and duplications were calculated using
reference and alternate alleles, variant start and stop positions, or appropriate
identifying information in the variant name. Variants that did not report any of
the above information were excluded from the analysis. The lengths of
reported indels (single variants that include both insertions and deletions
relative to the reference genome) were calculated by determining the number
of mismatches or gaps in the best pairwise alignment between the reference
and alternate alleles. a, Length distribution of insertions. b, Length
distribution of duplications. c, Length distribution of deletions. d, Length
distribution of indels.

Data collection Illumina MiSeq Control Software (3.1) was used on the Illumina MiSeq sequencers to collect the high-throughput sequencing data.


Data analysis CRISPResso2 was used to analyze HTS data for quantifying editing activity at genomic sites. Cell Sorter Software Version 3.0.5 was used
for flow cytometry analysis. RNA-seq demultiplexing was performed with bcl2fastq2 version 2.20, and sequences were trimmed with
TrimGalore v. 0.6.2. Alignment of RNA-seq reads to the human genome was performed with RSEM version 1.3.1. RNA-seq data output
was generated with limma-voom and visualized in R. Frequency, mean, and standard deviations were calculated using GraphPad Prism 8.
Custom Python scripts provided in Supplementary Note 4 were used to analyze and quantify guide RNA scaffold insertion.

Data exclusions No data was excluded.
Replication All experiments were repeated at least once. All attempts at replication were successful. 


Randomization Yeast and mammalian cells used in this study were grown under identical conditions; no randomization was used. 


Blinding Yeast and mammalian cells used in this study were grown under identical conditions; blinding was not used. 



Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) HEK293T (ATCC), U2OS (ATCC), K562 (ATCC), HeLa (ATCC).
Authentication Cells were authenticated by the supplier using STR analysis. 
Mycoplasma contamination All cell lines tested negative for mycoplasma. 


Commonly misidentified lines None used. 
(See ICLAC register) 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals To generate dissociated neuronal cultures, timed-pregnant C57BL/6 mice were provided by Charles River. Pregnant mice were 
euthanized at E18.5, and tissue for dissociated cultures was harvested from all embryos. 

Wild animals The study did not involve wild animals. 

Field-collected samples The study did not involve samples collected from the field. 

Ethics oversight The Broad IACUC provided ethical guidance. 



Methodology 


Sample preparation 2.5 × 10⁴ HEK293T cells grown in the absence of antibiotic were seeded on 48-well poly-D-lysine coated plates (Corning). 16–24 h
post-seeding, cells were transfected at approximately 70% confluency with 1 µl of Lipofectamine 2000 (Thermo Fisher Scientific)
according to the manufacturer’s protocols and 750 ng of PE2-P2A-GFP plasmid, 250 ng of pegRNA plasmid, and 83 ng of sgRNA
plasmid. Three days after transfection, cells were washed with phosphate-buffered saline (Gibco) and dissociated using TrypLE
Express (Gibco). Cells were then diluted with DMEM plus GlutaMax (Thermo Fisher Scientific) supplemented with 10% (v/v) FBS
(Gibco) and passed through a 35-µm cell strainer (Corning) prior to sorting. Cells were treated with 3 nM DAPI (BioLegend) 15
minutes prior to sorting.


Instrument Sony LE-MA900 Cell Sorter


Software Cell Sorter Software Version 3.0.5 (Sony) 


Cell population abundance Of the surviving single sorted HEK293T cells edited to have HEXA 1278+TATC, 3.02% were homozygous. Of the surviving single 
sorted HEK293T cells edited to have HBB E6V, 25% were homozygous. Cells were genotyped using next-generation sequencing 
(Illumina). 


Gating strategy HEK293T cells were initially gated on population using FSC-A/BSC-A (Gate A) and then sorted for singlets using FSC-A/FSC-H 
(Gate B). Live cells were sorted for by gating for DAPI-negative cells (Gate C). Finally the upper 50% of GFP expressing cells were 
sorted for using eGFP as the fluorochrome (Gate D). 



Features of higher-order chromatin organization, such as A/B compartments,
topologically associating domains and chromatin loops, are temporarily disrupted
during mitosis. Because these structures are thought to influence gene regulation, it
is important to understand how they are re-established after mitosis. Here we
examine the dynamics of chromosome reorganization by Hi-C after mitosis in highly
purified, synchronous mouse erythroid cell populations. We observed rapid
establishment of A/B compartments, followed by their gradual intensification and
expansion. Contact domains form from the ‘bottom up’: smaller subTADs are formed
initially, followed by convergence into multi-domain TAD structures. CTCF is partially
retained on mitotic chromosomes and immediately resumes full binding in
ana/telophase. By contrast, cohesin is completely evicted from mitotic chromosomes and
regains focal binding at a slower rate. The formation of CTCF/cohesin co-anchored
structural loops follows the kinetics of cohesin positioning. Stripe-shaped contact
patterns, anchored by CTCF, grow in length, which is consistent with a
loop-extrusion process after mitosis. Interactions between cis-regulatory elements can
form rapidly, with rates exceeding those of CTCF/cohesin-anchored contacts.
Notably, we identified a group of rapidly emerging transient contacts between
cis-regulatory elements in ana/telophase that are dissolved upon G1 entry, co-incident
with the establishment of inner boundaries or nearby interfering chromatin loops.
We also describe the relationship between transcription reactivation and
architectural features. Our findings indicate that distinct but mutually influential
forces drive post-mitotic chromatin reconfiguration.


The global restructuring of chromosomal architecture during the
progression from mitosis into G1 phase provides an opportunity to examine
hierarchies and mechanisms of chromosome organization (Extended
Data Fig. 1a). We performed in situ Hi-C experiments at defined time
points after mitosis following nocodazole-induced prometaphase
arrest-release in G1E-ER4 cells, a well-characterized subline of the
mouse erythroblast line G1E (Fig. 1a). To ensure maximal purity of cell
populations, we used a fluorescence-activated cell sorting (FACS)-based
isolation strategy that relies on cell cycle markers and DNA content
(Extended Data Fig. 1b, c; Supplementary Methods). In situ Hi-C
collectively yielded around 2 billion uniquely mapped interactions, with high
concordance between biological replicates (Extended Data Fig. 1d-f).
Consistent with previous studies, compartments are largely eliminated
in prometaphase (Fig. 1b). In ana/telophase, the earliest examined
interval, compartments are already detectable visually and by
eigenvector decomposition, and gain in intensity as cells advance into G1
(Fig. 1b-d; Extended Data Fig. 2a-c). This is consistent with the results
of a multiplexed 4C-seq study, which reported the early establishment
of compartments after mitosis. As expected, the A-type compartment
is associated with active histone marks (Extended Data Fig. 2d). As
cells proceed towards late G1, the characteristic checkerboard pattern
of compartments visually expands away from the diagonal, leading to
increased interaction frequencies at large (>100 Mb) distance scales
(Fig. 1b, Extended Data Fig. 2e, f). Quantification of
compartmentalization at different genomic distance scales across all cell cycle stages
revealed a progressive gain of compartmentalization among distant
(>100 Mb) genomic regions, confirming the expansion of
compartments after mitosis (Extended Data Fig. 2g-i; Supplementary Methods).
Therefore, a major reconfiguration of genome structure occurs during
the prometaphase-G1 phase transition, involving a rapid establishment,
progressive strengthening, and expansion of A/B compartments
throughout the chromosome.
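
To make the notion of compartment strength concrete, the following minimal sketch computes the ratio indicated in the schematic of Extended Data Fig. 2b, (median(AA) + median(BB)) / (median(AB) + median(BA)). It is an illustration under stated assumptions, not the pipeline used in the study: the function name `compartment_strength`, the sign convention (positive eigenvector 1 values taken as A-type bins) and the toy matrix are all hypothetical, and the restriction to the most strongly ranked bins that saddle analyses usually apply is omitted.

```python
from statistics import median


def compartment_strength(obs_exp, ev1):
    """Ratio of same-type to cross-type contact enrichment.

    obs_exp: square, symmetric list-of-lists of observed/expected values.
    ev1: eigenvector 1 value per bin (assumption: > 0 is A, < 0 is B).
    Raises statistics.StatisticsError if a category has no bin pairs.
    """
    aa, bb, ab = [], [], []
    n = len(ev1)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle, diagonal excluded
            if ev1[i] == 0 or ev1[j] == 0:
                continue  # bins without a compartment call are skipped
            if ev1[i] > 0 and ev1[j] > 0:
                aa.append(obs_exp[i][j])
            elif ev1[i] < 0 and ev1[j] < 0:
                bb.append(obs_exp[i][j])
            else:
                ab.append(obs_exp[i][j])
    # The matrix is symmetric, so median(AB) equals median(BA).
    return (median(aa) + median(bb)) / (2 * median(ab))


# Toy example: two A bins followed by two B bins.
toy_ev1 = [1.0, 1.0, -1.0, -1.0]
toy_obs_exp = [
    [1.0, 2.0, 0.5, 0.5],
    [2.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 2.0],
    [0.5, 0.5, 2.0, 1.0],
]
strength = compartment_strength(toy_obs_exp, toy_ev1)
```

A value above 1 indicates that same-type contacts are enriched relative to A-B contacts; a featureless map would give a value close to 1.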

Next we examined the formation of topologically associating
domains (TADs) and nested subTADs after mitosis using 3DNetMod.
We identified a total of 8,082 contact domains that are progressively
gained from prometaphase to mid G1 (Fig. 2a; Supplementary Table 1).

Fig. 1 | Early appearance and progressive strengthening and expansion of
A/B compartments after mitosis. a, Schematic showing the reporter gene
encoding mCherry fused to the mouse cyclin B mitotic degradation domain
(mCherry-MD) and the expected mCherry signal at each cell cycle stage. Green
arrowheads indicate sorting of cells in anaphase or telophase (ana/telo). Asyn,
asynchronous; prometa, prometaphase. b, Hi-C contact maps showing the
restoration of chromatin A/B compartments of chromosome 1 (chr1) after
mitosis, along with genome browser tracks showing eigenvector 1 values. Bin
size, 250 kb. Arrows indicate expansion of compartments. c, A magnified view
(chr1: 87.3 Mb-138.3 Mb) of b revealing the clear plaid-like compartment
pattern in ana/telophase. d, Saddle plots showing the genome-wide
compartment strength over time.

Establishment of boundaries and enrichment of intradomain interactions
were observed at newly emerging domains, thereby validating our
domain-calling approach (Extended Data Fig. 3a-e). Previous studies
have reported a complete loss of domains in prometaphase. However,
despite considerable attenuation, residual domain- and boundary-like
structures are still detectable visually and algorithmically in
prometaphase cells (Extended Data Fig. 3f). To rule out contamination by G1 cells
as a cause of prometaphase domain detection, we simulated in silico
admixing with up to 20% of G1 chromosomes. Even a G1 contribution
of 20%, which far exceeds the observed interphase cell contamination
of up to 2%, did not reproduce patterns observed in prometaphase
(Extended Data Fig. 3f-h); this suggests that prometaphase domain-
and boundary-like features are not due to the presence of G1 phase cells.
Residual domain boundaries in prometaphase are enriched with active
histone marks and transcription start sites (Extended Data Fig. 3i, j).
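
The contamination control described above can be pictured as a weighted mixture of two contact maps. The sketch below is only an illustration: it assumes that both maps are scaled to the same total before mixing, and `normalize`, `admix` and the toy matrices are hypothetical names and data. The actual simulation procedure is not detailed in this text and may differ.

```python
def normalize(matrix):
    """Scale a contact matrix so that its entries sum to 1."""
    total = sum(sum(row) for row in matrix)
    return [[value / total for value in row] for row in matrix]


def admix(prometa, g1, g1_fraction):
    """Mix two equally sized contact maps.

    g1_fraction is the simulated share of G1 signal (0 to 1);
    the remainder comes from the prometaphase map.
    """
    if not 0.0 <= g1_fraction <= 1.0:
        raise ValueError("g1_fraction must be between 0 and 1")
    p, g = normalize(prometa), normalize(g1)
    return [
        [(1 - g1_fraction) * pv + g1_fraction * gv for pv, gv in zip(prow, grow)]
        for prow, grow in zip(p, g)
    ]


# Toy 2 x 2 maps (hypothetical counts).
toy_prometa = [[4, 1], [1, 4]]
toy_g1 = [[2, 3], [3, 2]]
mixed = admix(toy_prometa, toy_g1, 0.20)  # 20% simulated G1 contribution
```

Downstream features such as insulation scores can then be computed on `mixed` and compared with the experimental prometaphase map.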

Formation of nested domain structures may occur through the con- 
vergence of previously emerged subTADs (bottom-up), the partitioning 
of initially formed TADs into subTADs (top-down), or the simultaneous 
appearance of both contact domain types (Extended Data Fig. 4a). On 
average, contact domains that are established at time points later in G1
are larger than those called at earlier stages of the cell cycle (Fig. 2a, b); 
this observation favours the bottom-up formation scenario. To further 
test this model, we categorized all contact domains into 2,899 TADs and 
5,183 subTADs on the basis of their hierarchical organization (Fig. 2c). 
Notably, higher proportions of subTADs are detected in prometaphase 
or ana/telophase compared to the TADs that encompass them, which 
suggests that subTADs tend to assemble more rapidly (Fig. 2c). Once 
established, the majority of TADs remain unchanged without further 
subdivisions, disfavouring the ‘top-down’ model (Extended Data 
Fig. 4b). By contrast, 85.4% and 69.1% of subTADs called in prometa- 
phase and ana/telophase, respectively, converge into larger domains 
during later stages (Extended Data Fig. 4c). In line with subTAD merging,
we observed gains in contacts across subTAD boundaries over time
(Extended Data Fig. 4d). Accordingly, a substantial portion of subTAD
boundaries detected at prometaphase exhibit increased insulation
scores (indicative of reduced insulation), whereas for most TAD boundaries,
insulation scores decrease as cells progress from prometaphase
into G1 (Extended Data Fig. 4e). Independent algorithms yielded similar
trends of subTAD merging after mitosis (Extended Data Fig. 4f-m).
Together, these analyses suggest a ‘bottom-up’ model of hierarchical
domain reorganization during the prometa-to-G1 phase transition.

Fig. 2 | Contact domains develop from the bottom up after mitosis. a, Hi-C
contact maps coupled with insulation score tracks (chr2: 57.5 Mb-63.5 Mb).
Domains emerging at each stage of the cell cycle are demarcated by
colour-coded lines. Bin size, 10 kb. Colour bar denotes q-normed reads. b, Sizes of
domains newly detected at prometaphase (n = 1,528), ana/telophase (n = 2,394),
early G1 (n = 2,995) and mid G1 (n = 1,165). For all box plots, centre lines denote
medians; box limits denote 25th-75th percentile; whiskers denote 5th-95th
percentile. P values were calculated by a two-sided Mann-Whitney U-test.
c, Left, schematic showing the partition of domains into TADs or subTADs.
TADs are domains that are not encompassed by any other domains; subTADs
are domains that are completely encompassed by other domains. Right, pie
charts of the cell cycle distribution of subTADs and TADs that contain at least
one subTAD based on their time of emergence. P values were calculated using a
two-sided Fisher’s exact test (prometaphase + ana/telophase compared with
early G1 + mid G1).

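
The TAD/subTAD partition defined in the Fig. 2 legend (a TAD is not encompassed by any other domain; a subTAD is completely encompassed by another domain) can be sketched as follows. This is a minimal illustration with hypothetical coordinates and a hypothetical function name, not the 3DNetMod implementation; under this rule, a domain that only partially overlaps another is still labelled a TAD.

```python
def classify_domains(domains):
    """Label each (start, end) domain as 'TAD' or 'subTAD'.

    A domain is a subTAD if a different domain fully contains it;
    otherwise it is a TAD. Domains are assumed to lie on one chromosome.
    """
    labels = {}
    for d in domains:
        nested = any(
            o != d and o[0] <= d[0] and d[1] <= o[1]
            for o in domains
        )
        labels[d] = "subTAD" if nested else "TAD"
    return labels


# Toy coordinates in kb (hypothetical): one TAD with two nested
# subTADs, plus a separate TAD without nested domains.
toy_domains = [(0, 100), (0, 40), (40, 100), (120, 200)]
domain_labels = classify_domains(toy_domains)
```

The pairwise comparison is quadratic in the number of domains, which is acceptable for a few thousand domains per chromosome.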
A loop-extrusion model has been proposed to explain the formation
of TADs and chromatin loops, wherein the cohesin complex extrudes
the chromatid until it encounters pairs of convergently oriented CTCF-
binding sites. Because cell cycle dynamics of loop formation as
well as CTCF and cohesin binding could inform this (or alternative) 
models, we surveyed the chromatin-binding profiles of CTCF and 
cohesin by chromatin-immunoprecipitation followed by sequencing 
(ChIP-seq). We generated highly concordant replicates (Extended 
Data Fig. 1g, h) and identified 41,699 CTCF-binding sites and 22,003 
binding sites for Rad21, a cohesin subunit (Supplementary Table 2). 
Approximately 88.7% (19,520) of Rad21 peaks were co-occupied by 
CTCF. Notably, around 18.6% (7,741) of CTCF peaks are reproducibly 
detected in prometaphase cells, indicating that a considerable amount 
of CTCF remains bound to mitotic chromatin (Extended Data Fig. 5a, c, 
d). Previous reports have described varying degrees of CTCF mitotic 
retention. Unlike CTCF, Rad21 failed to show localized chromatin
binding during prometaphase (Extended Data Fig. 5b-d). Motif scan 
and genomic distribution analysis failed to identify distinct features 
associated with CTCF peaks present in both interphase and mitosis 
(IM-peaks) (Extended Data Fig. 5e, f). However, IM-peaks tend to be 
more tissue invariant and are more likely to be co-occupied by Rad21 
during interphase (Extended Data Fig. 5f). CTCF and cohesin resume 
chromatin occupancy after mitosis with markedly different kinetics. 

Fig. 3 | Focal accumulation of cohesin is delayed compared to that of CTCF
and coincides with structural loop formation. a, Venn diagrams showing the
distribution of CTCF and Rad21 ChIP-seq peaks across cell cycle stages. b, Box
plots showing the recovery rate of CTCF (n = 33,306) and Rad21 (n = 18,859)
peaks. Peaks absent from late G1 were omitted from the analysis. For all box
plots, centre lines denote medians; box limits denote 25th-75th percentile;
whiskers denote 5th-95th percentile. P values were calculated using a
two-sided Mann-Whitney U-test. c, Genome browser tracks of CTCF and Rad21 at
the Lonrf2 locus across cell cycle stages. n = 2-3 biological replicates. Blue and
yellow arrows indicate IM- and interphase-only (IO)-CTCF binding sites,
respectively. d, Schematic depicting the classification of loops. All loops with
CTCF/cohesin co-occupancy at both anchors were subdivided into those with
0, 1 or 2 anchors marked by cis-regulatory elements. Those with 0 or 1 were
operationally defined as structural loops. e, Heat map showing the result of
k-means clustering on the 4,712 structural loops. f, Hi-C contact maps showing
a representative region that contains a cluster 1 structural loop (chr2: 167.4 Mb-
167.9 Mb, black arrows), along with genome browser tracks of CTCF and Rad21
ChIP-seq profiles. Rad21 peaks at two loop anchors are indicated by red
arrowheads. Chevron arrows highlight positions and orientations of CTCF sites
at the loop anchors. Bin size, 10 kb. g, Capture-C interaction profile of the same
region as shown in f. n = 3 biological replicates. The anchor symbol shows the
position of the capture probe. h, i, Similar to f, g, showing a representative
region that contains a cluster 3 (slowly emerging) structural loop
(chr1: 50.6 Mb-52.0 Mb, black arrows).

The majority of CTCF peaks are immediately restored in ana/telophase,
whereas Rad21 peaks appear much more gradually (Fig. 3a-c; Extended
Data Fig. 5g-i). Delayed nuclear import as well as chromatin loading 
and/or movement along the chromatid could account for the slow 
focal accumulation of cohesin after mitosis. We performed live-cell 
imaging on asynchronous G1E-ER4 cells that endogenously express 
mCherry-tagged CTCF or mCherry-tagged SMC3 (a cohesin subunit) 
(Extended Data Fig. 5j). Consistent with the ChIP-seq data and a previous
report, CTCF rapidly accumulates on telophase chromosomes,
whereas SMC3 is excluded from chromosomes during metaphase, 
telophase and cytokinesis (Extended Data Fig. 5k). Moreover, after G1 
entry, nuclear import of SMC3 is also slower compared to that of CTCF 
(Extended Data Fig. 5k, l). These results suggest that the delayed kinetics
of focal cohesin accumulation might be a composite of nuclear import, 
association with chromatin, and migration along the chromatid. 
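
Co-occupancy figures such as the share of Rad21 peaks that coincide with CTCF peaks reduce to an interval-overlap count. The sketch below is a minimal illustration with hypothetical peak coordinates; the overlap criterion used in the study is not stated in this text, so any shared base pair between half-open intervals is assumed to count as co-occupancy.

```python
def fraction_overlapping(query, reference):
    """Fraction of query intervals that overlap a reference interval.

    Intervals are (chrom, start, end) tuples with half-open
    coordinates (an assumption made for this example).
    """
    by_chrom = {}
    for chrom, start, end in reference:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in query:
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(query)


# Toy peaks (hypothetical coordinates).
toy_rad21 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
toy_ctcf = [("chr1", 150, 250), ("chr2", 140, 160), ("chr3", 0, 10)]
co_occupied = fraction_overlapping(toy_rad21, toy_ctcf)
```

For the reported peak sets, the same ratio is 19,520 / 22,003, or about 88.7%.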

The transient decoupling of cohesin from CTCF during mitotic exit 
offers the opportunity to separately assess their roles in post-mitotic 
loop formation. Using a modified HICCUPS algorithm we identified 
13,317 chromatin loops, progressively gained from prometaphase to 
late G1, with highly concordant loop strength between biological replicates
(Extended Data Fig. 6a-c; Supplementary Table 3). Of these
loops, 6,285 (about 47.2%) contain CTCF and cohesin co-occupied sites 
at both anchors (Fig. 3d). These loops were further filtered to eliminate 
interactions between putative cis-regulatory elements (for example, 
enhancer-promoter loops), resulting in 4,712 operationally defined 
‘structural’ loops (Fig. 3d). To investigate how fast structural loops 
are formed we performed k-means clustering, which revealed three 
clusters with distinct formation dynamics (Fig. 3e). Cluster 1 loops 
display strong interactions in ana/telophase, whereas the formation of 
cluster 2 and 3 loops is delayed (Fig. 3e, f, h; Extended Data Fig. 6d, e). 
Analysis by Capture-C validated the differential dynamics of structural
loops at two representative loci (Fig. 3g, i). Notably, anchors of cluster 1 
loops show enrichment of Rad21 at ana/telophase, whereas anchors of 
cluster 2 and 3 loops acquire Rad21 more gradually (Fig. 3f, h; Extended 
Data Fig. 6d, e). By contrast, CTCF is rapidly enriched at anchors of all 
three loop clusters (Fig. 3f, h; Extended Data Fig. 6d, e). The strengths 
of structural loops are highly correlated with ChIP-seq signals of Rad21 
at their anchors over time, but significantly less so with those of CTCF 
(Extended Data Fig. 6f). Late-occurring structural loops are signifi- 
cantly larger than earlier ones, suggesting a correlation between size 
and time to formation (Extended Data Fig. 6g). Together, our results 
reveal three clusters of structural loops with distinct formation dynam- 
ics, and suggest that the accumulation of cohesin—but not CTCF—is 
rate-limiting for the formation of structural loops after mitosis. 
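
As a hedged illustration of how loops can be grouped by formation dynamics, the sketch below applies a plain k-means procedure to per-loop strength profiles across the five time points. The input normalization, the seeding strategy and any other settings used in the study are not given in this text; the deterministic farthest-first seeding, the function names and the toy profiles are assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add
    the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=50):
    """Return (labels, centroids) after Lloyd-style iterations."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            for p in points
        ]
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[i])  # keep empty cluster as is
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Toy loop-strength profiles (hypothetical), ordered as
# prometa, ana/telo, early G1, mid G1, late G1.
toy_profiles = [
    [0.1, 0.9, 1.0, 1.0, 1.0], [0.0, 0.8, 0.9, 1.0, 1.0],  # fast
    [0.1, 0.3, 0.8, 1.0, 1.0], [0.0, 0.2, 0.7, 0.9, 1.0],  # intermediate
    [0.0, 0.1, 0.2, 0.6, 1.0], [0.1, 0.1, 0.3, 0.5, 0.9],  # slow
]
loop_labels, loop_centroids = kmeans(toy_profiles, k=3)
```

Cluster numbers returned by k-means are arbitrary; ordering the clusters by the time point at which their centroid rises gives labels comparable to the fast, intermediate and slow groups described above.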

Stripes in the contact maps are thought to reflect interactions 
between a single locus and a continuum of genomic regions, and are 
considered as evidence for the loop extrusion model. Using a modified
statistical modelling approach, we identified 1,775 stripes genome-wide.
The majority of them contain inwardly oriented CTCF sites at
their anchors (Extended Data Fig. 7a). Notably, these striped contacts 
grow directionally over time but display punctuated enrichment at 
select CTCF sites (Extended Data Fig. 7b, d). This is consistent with 
an extrusion mechanism in which some CTCF-binding sites serve as 
obstacles to cohesin processivity. We also observed blockage of stripe 
extension that correlates with the presence of strong CTCF-binding 
sites, resulting in the formation of structural loops at the far end of 
the stripes (Extended Data Fig. 7b). Together, our data are consistent 
with dynamic loop extrusion after mitosis. Stripe-like patterns that 
appear rapidly with little or no further growth were also observed, and 
are discussed below (Extended Data Fig. 7c, e, f). 

Next we investigated interactions between cis-regulatory elements. 
We identified 3,812 chromatin loops with both anchors marked by pro- 
moters or putative enhancers, which we termed E/P loops (Fig. 4a). This 
number is probably an underestimate because short-range E/P loops
can escape detection. Notably, a considerable portion (approximately 
58.7%, 2,239) of E/P loops have only one or no anchor that co-localizes 
with CTCF and cohesin co-occupied sites, suggesting that E/P loops 
may form by a mechanism other than CTCF/cohesin-mediated loop 
extrusion (Fig. 4a). These seemingly CTCF/cohesin-independent E/P
loops are intensified significantly faster than structural loops (Fig. 4b, 
Extended Data Fig. 6h). Note that the faster formation of E/P loops 
compared to structural loops is not explained by differences in loop 
size (Extended Data Fig. 6i). Accordingly, among loops established
in ana/telophase, about 69.3% are E/P loops whereas only 11.6% are
structural loops (Extended Data Fig. 6j). These trends are reversed in
mid G1 (18.4% E/P and 42.3% structural loops, respectively). Hence, E/P
loops may not require CTCF and cohesin, and can be rebuilt faster than
structural loops after mitosis.

Fig. 4 | cis-Regulatory contacts are established rapidly after mitosis and can
be transient. a, Schematic depicting the classification of loops. E/P loops were
subdivided into those with 0, 1 or 2 anchors containing CTCF/cohesin
co-occupied sites. Those with 0 or 1 anchor co-occupied by CTCF/cohesin were
classified as E/P loops independent from CTCF and cohesin. b, Aggregated peak
analysis of CTCF/cohesin-independent E/P loops (middle and bottom) in
comparison to structural loops (top). Bin size, 10 kb. Numbers indicate average
loop strength: ln(observed/expected). c, Heat map of k-means clustered E/P
loops. d, Hi-C contact maps of a representative region (chr2: 44.7 Mb-45.1 Mb)
containing cluster 1 E/P loops (green arrows), coupled with browser tracks of
CTCF and Rad21 occupancy. Bin size, 10 kb. The colour bar denotes q-normed
reads. Tracks of H3K4me3, H3K4me1, H3K27ac and annotations of cis-regulatory
elements were from asynchronously growing G1E-ER4 cells. e, Similar to d, a
representative region (Commd3 locus, chr2: 18.4 Mb-19.4 Mb) containing a
cluster 3 (transient) E/P loop (red arrows). Blue arrows denote the formation of a
downstream, potentially interfering structural loop. Purple arrowheads indicate
CTCF/cohesin binding at the potentially interfering structural loop anchor.
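
The operational loop classes used in this study (structural loops, E/P loops and CTCF/cohesin-independent E/P loops) follow from two properties of each anchor. The sketch below restates those definitions in code; the class and function names and the label strings are hypothetical, and how an anchor is scored for each property is outside the scope of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    ctcf_cohesin: bool  # co-occupied by CTCF and cohesin
    cre: bool           # overlaps a promoter or putative enhancer


def classify_loop(a, b):
    """Classify a loop from its two anchors."""
    n_cc = a.ctcf_cohesin + b.ctcf_cohesin  # anchors with CTCF/cohesin
    n_cre = a.cre + b.cre                   # anchors with a cis-regulatory element
    if n_cre == 2:
        if n_cc <= 1:
            return "E/P, CTCF/cohesin-independent"
        return "E/P, CTCF/cohesin at both anchors"
    if n_cc == 2:
        return "structural"  # CTCF/cohesin at both anchors, 0 or 1 CRE anchors
    return "other"


examples = [
    classify_loop(Anchor(True, False), Anchor(True, True)),
    classify_loop(Anchor(False, True), Anchor(True, True)),
    classify_loop(Anchor(True, True), Anchor(True, True)),
    classify_loop(Anchor(False, False), Anchor(True, False)),
]
```

The reported counts are consistent with this scheme: 6,285 - 4,712 = 1,573 loops have CTCF/cohesin at both anchors and cis-regulatory elements at both anchors, which matches 3,812 - 2,239 = 1,573.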

Clustering all E/P loops on the basis of their time of enrichment 
yielded at least three classes with distinct post-mitotic formation 
kinetics. Cluster 1 (2,211, 58%) E/P contacts are rapidly enriched in ana/ 
telophase, whereas cluster 2 contacts (1,201, 31.5%) form in early G1 
(Fig. 4c, d; Extended Data Fig. 8a, b). We also discovered a third cluster 
(400, 10.5%) of E/P loops that peak early in ana/telophase and gradually
diminish in G1 (Fig. 4c, e; Extended Data Fig. 8c, d, f). We independently
validated the transient nature of contacts between certain cis-regulatory
elements by Capture-C at the two manually identified loci Pde12 and
Morc3 (Extended Data Fig. 8c, e). In an effort to understand the mechanisms
that underlie this subset of transient E/P loops, we noticed that
approximately 55% of them span either a boundary or an anchor of a 
nearby structural loop that is established later in G1 (Fig. 4e, Extended 
Data Fig. 8c). Moreover, these boundaries and loop anchors within clus- 
ter 3 E/P loops display more substantial insulation compared to those 
within clusters 1 or 2 (Extended Data Fig. 8g). We therefore speculate 
that emerging boundaries or nearby structural loops may interfere with 
E/P loops (Extended Data Fig. 1a). To test this hypothesis, we set out to 
assay cluster 3 E/P loop dynamics after perturbing the nearby structural
loop. We focused on the interaction between the Commd3 promoter
and a distal cis-regulatory element. We deleted the CTCF core motif of
a potential interfering structural loop anchor, resulting in the abroga- 
tion of CTCF and Rad21 binding (Extended Data Fig. 8f, h, i). Notably, in 
the mutant cells, interactions between the Commd3 promoter and the 
distal cis-regulatory element are prolonged after mitosis, compared to 
controls (Extended Data Fig. 8j-l). These results provide a precedent 
for a dynamic interplay between structural and E/P loops. However, 
insulation between regulatory elements is unlikely to fully explain the 
transient nature of cluster 3 E/P loops, because only around 55% of them 
span boundaries or interfering loop anchors. Additional mechanisms, 
such as competition between regulatory elements, may also contribute
to the transient nature of cluster 3 E/P loops. In summary, we identified
a special class of transient E/P contacts after mitosis, which may
in some cases be broken by CTCF and cohesin.
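
The statement that roughly 55% of transient E/P loops span a boundary or an interfering loop anchor rests on a simple positional test. A minimal sketch of such a test is given below, assuming that a loop spans a feature when the feature lies strictly between the inner edges of the two anchors; the exact criterion used in the study is not given in this text, and the function name and coordinates are hypothetical.

```python
from bisect import bisect_right


def spans_any(loop, positions_sorted):
    """True if any position lies strictly between the loop anchors.

    loop: (left, right) inner edges of the two anchors, left < right.
    positions_sorted: sorted boundary or anchor positions on the
    same chromosome.
    """
    left, right = loop
    i = bisect_right(positions_sorted, left)  # first position > left
    return i < len(positions_sorted) and positions_sorted[i] < right


# Toy data in kb (hypothetical).
toy_boundaries = [150, 400]
toy_loops = [(100, 200), (200, 300), (390, 410), (400, 500)]
spanning = [spans_any(loop, toy_boundaries) for loop in toy_loops]
fraction_spanning = sum(spanning) / len(toy_loops)
```

A binary search keeps the test fast when thousands of loops are checked against thousands of boundaries.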

To explore the relationship between chromatin organization and 
transcription activation after mitosis, we carried out Pol II ChIP-seq
(Extended Data Fig. 1i). Transcription is largely silenced in prometaphase,
but rapidly reinitiates in ana/telophase and positively correlates
with A-type compartments (Extended Data Fig. 9a, b). Collectively, we 
identified 7,535 active genes after mitosis (Supplementary Table 4). 
Genes display comparable reactivation dynamics regardless of whether 
they are located in domains called at early or later stages of the cell cycle, 
suggesting that domain formation may exert only limited influence on 
gene reactivation after mitosis (Extended Data Fig. 9c). We then stratified
active genes on the basis of their Pol II occupancy over time through
principal component analysis. In a previous study we observed that
a large fraction of genes acquires strong Pol II occupancy early after 
mitosis, followed by a reduction in signal intensity. This ‘spike’ in gene 
reactivation manifests as the first principal component (PC1) and separates
‘spiking’ genes from late-activating genes. Similarly, the current
data recapitulate this transient hyperactivation as represented by PC1 
(Extended Data Fig. 9d-f). To examine the relationship between gene 
spiking and E/P loop formation, we began by stratifying all active genes 
on the basis of whether or not they are positioned at E/P loop anchors 
(Extended Data Fig. 9g, h). In general, formation of E/P loops is positively
correlated with Pol II occupancy over time (median Pearson r = 0.65).
Additionally, we found that genes at cluster 3 E/P loops are more likely
to display post-mitotic transcriptional spiking compared to those at
cluster 1 or 2 loops, or those with no detectable E/P loops (Extended
Data Fig. 9i, j). Regarding genes associated with cluster 1 or 2 E/P loops,
activation was also positively correlated with loop strength over time
(median Pearson r = 0.67). These results suggest that transient E/P loops
may contribute to post-mitotic gene spiking. However, a caveat to this 
interpretation is that a much larger number of genes spike than are 
associated with transient E/P loops. This suggests that E/P contacts 
cannot be solely responsible for spiking in post-mitotic transcription. 
Nonetheless, although the causal relationship between gene spiking and 
transient E/P loops remains uncertain, the overall positive correlation 
between E/P loop strength and Pol II occupancy over time suggests a 
potential role of E/P contacts in transcription after mitosis. 
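
A median Pearson r of the kind quoted above can be obtained by correlating, for each gene, its E/P loop strength with its Pol II occupancy across the time points, and then taking the median over genes. The sketch below illustrates that calculation on toy values; how loop strength and Pol II occupancy were paired and normalized in the study is an assumption here, and the function names are hypothetical.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equally long series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def median_correlation(pairs):
    """Median of per-gene correlations; pairs holds (loop, pol2) series."""
    return median(pearson(loop, pol2) for loop, pol2 in pairs)


# Toy time courses for three genes (hypothetical values).
toy_pairs = [
    ([1, 2, 3, 4], [2, 4, 6, 8]),  # perfectly correlated
    ([1, 2, 3, 4], [4, 3, 2, 1]),  # perfectly anti-correlated
    ([1, 2, 3, 4], [1, 3, 2, 4]),  # partially correlated
]
median_r = median_correlation(toy_pairs)
```

With only a handful of time points per gene, individual coefficients are noisy, which is one reason to summarize them by the median across many genes.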

We exploited the natural transition from a relatively unorganized 
state (prometaphase) into fully established chromatin organization 
late in G1 to interrogate mechanisms by which chromatin is hierarchically
organized (Extended Data Fig. 1a). We showed that A/B
compartmentalization was disrupted in prometaphase despite histone marks
being largely maintained. We also show that local (around 10 Mb)
compartmentalization of chromatin initiates rapidly after mitosis, and 
continues to expand and increase in strength. Study of the cell cycle 
dynamics of chromatin also enabled the testing of predictions made 
by the loop extrusion model. First, small TADs and structural loops 
are formed more quickly than larger ones. Second, stripes in the con- 
tact maps increase in length over time. Third, based on the kinetics of 
CTCF and cohesin deposition on chromatin, it is clear that CTCF does 
not form detectable loops without cohesin even though it can form 
multimers. However, it is possible that CTCF pairs with itself (or with
other factors such as YY1) to facilitate the establishment of contacts
among cis-regulatory elements, such as those observed at early time 
points independently of cohesin. 

Our integrative analysis of loops and histone-modification profiles 
reveals a group of E/P loops that can be independent from CTCF and 
cohesin co-binding. A distinctive feature of E/P loops is their fast appear- 
ance compared to structural loops. It is possible that E/P contacts form 
via collisions of chromatin regions with similar epigenetic states. This 
is supported by our observation that their post-mitotic recovery rate 
positively correlates with the intensity of active histone marks at anchors 
(Extended Data Fig. 8m). It is noteworthy that 16.4% of stripe-like structures
that lack inwardly oriented CTCF display only little or no further
growth during G1 phase and are highly enriched for histone H3 Lys27 
acetylation at their anchors (Extended Data Fig. 7c, e, f). Loop extrusion 
is unlikely to account for these types of stripe-shaped contact. Instead, 
these contacts might represent small compartments, defined by local 
enrichment of transcription factors and chromatin modifications.
Similarly, transient E/P loops might result from less discriminatory 
affinity among regions with similar chromatin states. In summary, our 
findings describe a dynamic hierarchical framework of post-mitotic 
chromatin configuration that supports a bottom-up model for the formation
of contact domains, implicates CTCF and cohesin in post-mitotic
loop extrusion, and identifies extrusion-independent pathways that
lead to compartmentalization and contacts of cis-regulatory networks. 

Extended Data Fig. 1 | Models, experimental workflow and data quality
control. a, From top to bottom: schematic illustration of the early emergence,
gradual intensification and expansion of A/B compartments (checkerboards)
from prometaphase to late G1 phase, coupled with schematics of chromatin
organization; subTADs (small triangles) emerge first after mitotic exit,
followed by convergence into a TAD (big triangle); formation of a structural
loop coincides with the positioning of cohesin, but not CTCF after mitosis; the
gradual extrusion of cohesin complex along DNA fibre from one anchor point
with CTCF, reflected as enrichment of interactions between the anchor and a
continuum of DNA loci on the contact map; fast formation of E/P loops after
mitosis; the interplay between transient E/P loops and boundaries or structural
loops. b, The experimental workflow. Representative flow cytometry plots
showing the nocodazole arrest-release strategy based on pMPM2
(prometaphase), mCherry-MD signal, and DNA content (DAPI) staining.
Similar observations were made in more than 5 independent experiments.
c, Representative images showing DAPI and lamin B1 staining of FACS-purified
cells across all stages of the cell cycle. Similar observations were made in 2
independent experiments. The mitotic index of prometaphase cells after FACS
purification is on average greater than 98%. Yellow and white arrowheads
indicate anaphase and telophase cells, respectively. Scale bar, 10 µm. d, Hexbin
plots showing the high correlation of Hi-C raw read counts between two
biological replicates across all stages of the cell cycle. Bin size, 250 kb. e, Heat
map showing the Pearson correlation among all Hi-C samples, based on the
eigenvector 1 of 250 kb bins. f, Heat map showing the Pearson correlation
among all Hi-C samples based on raw read counts. Bin size, 250 kb. g-i, Heat
maps showing Pearson correlation of CTCF (g), Rad21 (h) and Pol II (i) ChIP-seq
data among all samples. Note the overall high replicate concordance. Low
correlation coefficients among replicates were only observed in samples with
low signal-to-noise ratios (for example, in prometaphase).


[Extended Data Fig. 2, panels a-i; image not reproduced. Recoverable labels: compartment strength = (median(AA) + median(BB)) / (median(AB) + median(BA)); R(s) = Spearman correlation of obs/exp versus (PC1 of bin1 × PC1 of bin2); axes, genomic distance (bp) and genomic distance (Mb); stages, prometa, ana/telo, early G1, mid G1 and late G1; histone marks, H3K4me1, H3K4me3, H3K27ac, H3K36me3 and H3K9me3.]


Extended Data Fig. 2| Compartment strengthening and expansion from 
ana/telophase throughout late G1. a, Saddle plots showing the progressive 
gain of compartment strength over time in two biological replicates. 

b, Schematic showing the calculation of compartment strength. c, Line graphs 
showing the progressive increase of compartment strength of each individual 
chromosome (represented by dots) in two biological replicates. d, Heat map 
showing the genome-wide Spearman correlation coefficients between 
eigenvector 1 values and asynchronous-cell-derived ChIP-seq signals for the 
indicated histone marks. e, Plots of chromosome-averaged distance- 
dependent contact frequency (P(s)) at all stages of the cell cycle. f, P(s) plots of 
each individual chromosome (two biological replicates). g, A schematic
illustrating how compartmentalization levels (R(s)) were calculated at different
distance scales (for example, 1 Mb or 100 Mb). Each dotted line indicates a
series of 250-kb bin-bin pairs that are separated by a given genomic distance s
(the distance from the diagonal to the dotted line). For all bin-bin pairs
separated by a distance of s, a Spearman correlation coefficient R(s) was
generated between observed/expected and the product of two eigenvector 1
values (PC1 (bin1) × PC1 (bin2)). R(s) is expected to be high in
well-compartmentalized regions (left) and low at large distance scales with no
compartments (right). h, Replicate-averaged R(s) of each individual
chromosome across all stages of the cell cycle when s is equal to 10, 50 and
125 Mb (only eight chromosomes were computed at the 125-Mb scale). i, Line
graph showing the level of compartmentalization of chr1 against genomic
distance at each stage of the cell cycle.
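
The R(s) calculation described in panel g can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: it assumes a dense observed/expected matrix `obs_exp` and an eigenvector 1 list `pc1` on the same bins, and the function names and toy data are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def compartmentalization(obs_exp, pc1, s_bins):
    """R(s) over all bin pairs (i, i + s_bins) along one off-diagonal."""
    n = len(pc1)
    contacts = [obs_exp[i][i + s_bins] for i in range(n - s_bins)]
    products = [pc1[i] * pc1[i + s_bins] for i in range(n - s_bins)]
    return spearman(contacts, products)


# Toy data (hypothetical): contacts rise monotonically with PC1(bin1) x PC1(bin2),
# which mimics a perfectly compartmentalized chromosome.
toy_pc1 = [1.0, -0.8, 0.5, -1.2, 0.9, -0.3]
toy_obs_exp = [[2.0 ** (a * b) for b in toy_pc1] for a in toy_pc1]
r_toy = compartmentalization(toy_obs_exp, toy_pc1, s_bins=1)
```

With 250-kb bins, a distance of 10 Mb would correspond to `s_bins=40` in this sketch.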


Extended Data Fig. 3 | Domain detection and residual ‘domain-like’ 
structures in prometaphase. a, b, Meta-region plots and density heat maps of 
insulation scores (a) and directionality index (b) centred around domain 
boundaries initially detected at each stage of the cell cycle. c, Scatter plots
showing Pearson correlations of insulation scores at domain boundaries 
between two biological replicates. d, Aggregated domain analysis (ADA) of 
domains initially detected at each stage of the cell cycle. e, Box plots showing 
ADA scores over time for domains initially detected at prometaphase
(n = 1,360), ana/telophase (n = 2,260), early G1 (n = 2,875) and mid G1 (n = 1,112).
For all box plots, centre lines denote medians; box limits denote 25th-75th
percentile; whiskers denote 5th-95th percentile. P values were calculated using
a two-sided Mann-Whitney U-test. The dotted line indicates the average ADA
score of initial domain detection. f, Hi-C contact maps of two representative 


regions (chr8: 113 Mb-114 Mb and chr9: 72 Mb-73 Mb) showing residual 
domain- and boundary-like structures (yellow lines) in prometaphase in 
merged and individual biological replicates. Bin size, 10 kb. g, Simulated 
featureless, per cent 'G1 contaminated', and early G1 contact maps of the same
regions as in f. Bin size, 10 kb. h, Meta-region plots showing the insulation
scores of prometaphase, simulated featureless, 'G1-contaminated' and early G1
samples, centred around prometaphase boundaries in chr8 and chr9. i, Meta-region
plots showing indicated histone modification profiles centred around
boundaries newly detected at each stage of the cell cycle. j, Bar graphs showing
the enrichment of transcription start sites (overall, housekeeping and
tissue-specific) within ±20 kb of boundaries newly detected at each stage of the cell
cycle.
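
The box-plot convention used in these captions (median, 25th-75th percentile box, 5th-95th percentile whiskers) can be reproduced with a short helper. This is a sketch only: the function name is hypothetical, and the percentile interpolation method (`inclusive`) is an assumption, since the captions do not state which one was used.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the five statistics drawn in the box plots described above."""
    # quantiles(..., n=100) returns 99 cut points: the 1st to 99th percentiles.
    cuts = quantiles(values, n=100, method="inclusive")
    return {
        "whisker_low": cuts[4],    # 5th percentile
        "box_low": cuts[24],       # 25th percentile
        "median": cuts[49],        # 50th percentile
        "box_high": cuts[74],      # 75th percentile
        "whisker_high": cuts[94],  # 95th percentile
    }


# Toy data (hypothetical): 101 evenly spaced values.
toy_values = list(range(101))
toy_summary = box_plot_summary(toy_values)
```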


Extended Data Fig. 4 | Dynamics of TAD and subTAD after mitosis. 

a, Schematic of possible models of hierarchical domain formation: bottom-up 
(merge), top-down (split) and concomitant. b, Bar graphs showing the fraction 
of TADs that display either type of behaviour after detection. c, Bar graphs 
showing the fraction of subTADs that display each of the four potential 
behaviours after detection: merge, split, merge and split, and static. d, Bottom,
schematic showing partitioning of boundaries into TAD and subTAD 
boundaries. Top, Hi-C contact maps showing the change in insulation of 
representative TAD and subTAD boundaries from ana/telophase to late G1. 
SubTAD and TAD boundaries are indicated by green and blue arrows, 
respectively. Bin size, 10 kb. e, Bin plots showing the change in insulation score 
over time of TAD boundaries (top) and subTAD boundaries (bottom) that are 
detected at prometaphase in merged replicates and in two biological 
replicates. f, Box plots showing sizes of domains initially detected at
prometaphase (n = 2,494), ana/telophase (n = 1,699), early G1 (n = 1,357) and mid
G1 (n = 682) by rGMAP. For all box plots, centre lines denote medians; box limits
denote 25th-75th percentile; whiskers denote 5th-95th percentile. P values
were calculated using a two-sided Mann-Whitney U-test. g, Pie charts of the cell
cycle distribution of subTADs and TADs that contain at least 1 subTAD based on
their time of emergence (called by rGMAP). The P value was calculated using a
two-sided Fisher's exact test (prometaphase + ana/telophase compared with
early G1 + mid G1). h, Bar graphs showing the fraction of subTADs detected by
rGMAP that display each of the four potential behaviours after detection: 
merge, split, merge and split, and static. i, Bin plots showing the change in 
insulation score of TAD boundaries (left) and subTAD boundaries (right) that 
are detected by rGMAP at prometaphase. j, Box plots showing the sizes of 
domains initially detected at prometaphase (n = 1,105), ana/telophase
(n = 1,124), early G1 (n = 2,385) and mid G1 (n = 520) by DI+sweep (directionality
index + window size adjustment). For all box plots, centre lines denote medians;
box limits denote 25th-75th percentile; whiskers denote 5th-95th percentile.
P values were calculated by two-sided Mann-Whitney U-test. k-m, Similar to
g-i, showing analyses based on domains called by DI+sweep. 


Extended Data Fig. 5 | CTCF and cohesin chromatin occupancy in mitosis
and G1 entry. a, A density heat map of CTCF ChIP-seq data of each biological
replicate of asynchronous and prometaphase samples, centred around IM- and
IO-CTCF binding sites. b, A density heat map of Rad21 ChIP-seq data of both
biological replicates of asynchronous and prometaphase samples centred
around all Rad21 peaks. c, Genome browser tracks showing CTCF and Rad21
ChIP-seq signals of asynchronous and prometaphase samples at indicated
regions. n = 2-3 biological replicates. d, ChIP-qPCR data of CTCF and Rad21 in
asynchronous (n = 3, 6 biological replicates for CTCF and Rad21, respectively)
and prometaphase samples (n = 4, 3 biological replicates for CTCF and Rad21,
respectively). Data are mean ± s.e.m. e, Motif enrichment analysis of IM- and
IO-CTCF binding sites with indicated E values as determined by MEME-ChIP. f, Top,
donut charts showing the genome-wide distribution of IM- and IO-CTCF
binding sites. Middle, bar graphs showing the percentage of IM- or IO-CTCF
binding sites that are found in indicated numbers of tissues. Bottom, donut
chart showing the fraction of IM- and IO-CTCF binding sites that are
co-occupied by Rad21. g, Density heat maps and meta-region plots of CTCF and


Rad21 ChIP-seq data across all time points centred around CTCF-specific and
CTCF/Rad21 co-occupied binding sites. h, Bin plots showing ChIP-seq signals
of CTCF and Rad21 peaks for each stage of the cell cycle (y axis) against late G1
(x axis). i, ChIP-qPCR of CTCF and Rad21 at indicated binding sites over time.
n = 2 biological replicates for 0 and 25 min, and n = 3 biological replicates for
120 and 240 min after nocodazole release. Data are mean ± s.e.m. j, Schematic
showing mCherry-tagging of endogenous CTCF and SMC3. k, Representative
images (from at least 10 dividing cells) illustrating the behaviour of
mCherry-tagged CTCF and SMC3 during mitosis to early G1 phase progression. Similar
observations were made in 2 independent experiments. Yellow dotted circles
demarcate cell nuclei after mitosis. Scale bar, 5 μm. l, Average recovery curve
of mCherry-tagged CTCF and SMC3 that co-localize with H2B-YFP. Cells
(11 mother cells/22 daughter cells and 10 mother cells/18 daughter cells) were
analysed for CTCF and SMC3, respectively. P values were calculated using a
two-sided Student's t-test. Data are mean ± s.e.m. P values were omitted at time
points with fewer than 5 cells.


[Extended Data Fig. 6 (image not reproduced). Recoverable labels: a, 13,317 non-redundant loops; b, APA signal at prometa, ana/telo, early G1, mid G1 and late G1 (-65 kb to 65 kb); c, loop read counts per replicate, with r = 0.8212 (prometa), 0.8334 (ana/telo), 0.8922 (early G1), 0.8861 (mid G1) and 0.8822 (late G1); d, contact maps with CTCF and Rad21 tracks for merged replicates, replicate 1 and replicate 2; f, loop strength versus CTCF or Rad21 ChIP-seq peak strength; h, i, recovery curves against time after nocodazole release (min), with a size-matched comparison of loop sizes (bp); j, fraction of loops (E/P loops, structural loops, others) initially called at ana/telo, early G1, mid G1 and late G1.]
Extended Data Fig. 6 | Loop statistics and k-means clustering on structural
loops. a, Bar graph showing the number of loop calls at each stage of the cell
cycle. b, Aggregated peak analysis (APA) of loops initially detected at each
stage of the cell cycle. Bin size, 10 kb. Numbers indicate average loop strength:
ln(obs/exp). c, Scatter plots showing the Pearson correlation of loop strength
(read counts) between two biological replicates. d, Hi-C contact maps showing 
representative regions that contain cluster 1 (chr1: 172.8 Mb-173 Mb), 2 (chr1: 
90.2 Mb-90.8 Mb) and 3 (chr2: 47.5 Mb-49 Mb) structural loops in merged and 
both biological replicates. Bin size, 10 kb. Loop signal enrichment is indicated 
by black arrows. Contact maps are coupled with genome browser tracks 
showing CTCF and cohesin occupancy across all stages of the cell cycle. 
Chevron arrows mark orientations of CTCF sites at loop anchors. e, APA of 
cluster 1, 2 and 3 structural loops across all stages of the cell cycle. Each heat 
map is coupled with four meta-region plots corresponding to CTCF and Rad21 
ChIP-seq signals centred around either upstream or downstream loop 
anchors. Bin size, 10 kb. Numbers indicate average loop strength: ln(obs/exp).
f, Left and right, schematics showing how correlations are computed between
CTCF or Rad21 and loop strength over time. Middle, box plot showing the
Pearson correlation coefficients between CTCF or Rad21 ChIP-seq peak
strength at upstream or downstream anchors and structural loop strength
over time (n = 4,712). For all box plots, centre lines denote medians; box limits
denote 25th-75th percentile; whiskers denote 5th-95th percentile. P values


were calculated using a two-sided Wilcoxon signed-rank test. g, Box plot 
showing sizes of structural loops initially detected at ana/telophase (n=90), 
early G1 (n = 2,233), mid G1 (n = 1,595) and late G1 (n = 793). For all box plots,
centre lines denote medians; box limits denote 25th-75th percentile; whiskers
denote 5th-95th percentile. P values were calculated using a two-sided
Mann-Whitney U-test. h, Average recovery curves of structural loops (n = 4,241) and
E/P loops with 0 (n = 678) or 1 (n = 1,338) anchor co-occupied by CTCF/cohesin.
The 10% of loops with the smallest increment from prometaphase to late G1
were filtered out from the analysis. Data are mean ± 99% confidence interval.
*** and ###, P < 2.2 × 10⁻¹⁶ (structural loops compared with E/P loops with 0 or
1 anchor co-occupied by CTCF/cohesin, respectively). Two-sided
Mann-Whitney U-test. i, Left, average recovery curves of randomly sampled and
size-matched structural loops and CTCF/cohesin independent E/P loops (n = 2,869
for both groups). The 10% of loops with the smallest increment from
prometaphase to late G1 were filtered out from the analysis. Data are mean ±
99% confidence interval. P values were calculated using a two-sided
Mann-Whitney U-test. Right, box plot showing the comparable size distribution of
these two randomly sampled groups (n = 2,869 for both). For both box plots,
centre lines denote medians; box limits denote 25th-75th percentile; whiskers
denote 5th-95th percentile. j, Bar graphs depicting the composition of loops
newly called at each stage of the cell cycle. 
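
The filtering step used for the recovery curves in panels h and i (discarding the 10% of loops with the smallest increment from prometaphase to late G1, then averaging) can be sketched as below. This is an illustrative reading of the caption, not the authors' implementation: the function names and toy values are hypothetical, and any normalization applied before averaging is not specified in the caption and is left out.

```python
def filter_smallest_increment(loops, drop_fraction=0.10):
    """Drop the loops whose strength increases least from first to last time point.

    Each loop is a list of strengths ordered in time
    (prometaphase first, late G1 last).
    """
    ranked = sorted(loops, key=lambda strengths: strengths[-1] - strengths[0])
    n_drop = int(len(ranked) * drop_fraction)
    return ranked[n_drop:]


def mean_recovery_curve(loops):
    """Average loop strength at each time point across loops."""
    return [sum(column) / len(loops) for column in zip(*loops)]


# Toy data (hypothetical): nine loops that strengthen over time, one that stays flat.
toy_loops = [[0.0, 0.2, 0.5, 0.8, 1.0 + 0.1 * k] for k in range(9)]
toy_loops.append([0.5, 0.5, 0.5, 0.5, 0.5])

kept_loops = filter_smallest_increment(toy_loops)
toy_curve = mean_recovery_curve(kept_loops)
```

In this toy case the flat loop is the one removed, so the averaged curve reflects only loops that actually strengthen after mitosis.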
Extended Data Fig. 7 | Reformation of chromatin stripes after mitosis. a, Pie
chart showing the fraction of stripes with inwardly oriented CTCF at stripe
anchors. b, Hi-C contact maps of two representative regions (chr2: 12.75 Mb-14.75 Mb
and chr1: 130.5 Mb-132.5 Mb) that contain stripes with inwardly
oriented CTCF. Bin size, 10 kb. Contact maps are coupled with genome browser
tracks of CTCF and Rad21 across all stages of the cell cycle and tracks of
asynchronous H3K4me3, H3K4me1 and H3K27ac and annotation of cis-regulatory
elements. Chevron arrows mark positions and orientations of CTCF
peaks at stripe and loop anchors. Lengthening of stripes is indicated by black
arrows. Stripe anchors are indicated by purple arrows. Loops along the stripe
axis and at the far end of stripes are indicated by blue circles. c, Similar to b, Hi-C
contact maps showing a representative stripe (chr10: 118.2 Mb-118.8 Mb) that
does not have inwardly oriented CTCF at the stripe anchor. d, Left, aggregated
Hi-C contact maps that compile all stripes with inwardly oriented CTCF to show
their overall dynamic growth after mitosis. Right, box plots showing the
lengths of these stripes at ana/telophase (n = 235), early G1 (n = 1,472), mid G1
(n = 1,477) and late G1 (n = 1,473). For all box plots, centre lines denote medians;
box limits denote 25th-75th percentile; whiskers denote 5th-95th percentile.
P values were calculated using a two-sided Mann-Whitney U-test. e, Similar to
d, showing stripes without inwardly oriented CTCF. n = 72, 281, 277, 272 for
ana/telophase, early G1, mid G1 and late G1, respectively. f, H3K27ac ChIP-seq
profile from asynchronous G1E-ER4 cells is plotted -200 kb to 2 Mb around the
horizontal stripe anchors and -2 Mb to 200 kb around the vertical stripe
anchors. Anchor position is indicated by purple arrows.
Extended Data Fig. 8 | Supplementary E/P loop analyses. a, APA of the three
clusters of E/P loops on merged and two biological replicates. Bin size, 10 kb.
Numbers indicate average loop strength: ln(obs/exp). b, Hi-C contact maps
showing an additional example of cluster 1 E/P loop (chr1: 43.45 Mb-43.65 Mb,
green arrow). Bin size, 10 kb. Colour bar denotes g-normed reads. Contact
maps are coupled with genome browser tracks of CTCF and cohesin across all
time points as well as asynchronous H3K4me3, H3K4me1 and H3K27ac and
annotations of cis-regulatory elements. c, Similar to b, showing two examples 
of manually identified transient E/P contacts (Pde12 locus and Morc3 locus, 
indicated by red arrow). Boundaries or structural loop anchors that potentially 
interfere with these E/P contacts are indicated by black and blue arrows, 
respectively. Contact maps are coupled with tracks of Capture-C interaction 
profiles. Probes (anchor symbol) are located at promoters of Pde12 and Morc3
genes. d, Hi-C contact maps showing the Pde12 locus on two biological
replicates. Bin size, 10 kb. e, Quantification of the Capture-C read density of the
red regions in c. n = 3 biological replicates. Data are mean ± s.e.m. P values were
calculated from two-sided Student's t-test. f, Similar to d, Hi-C contact maps
showing the cluster 3 E/P loop (red arrows) at Commd3 locus in two biological
replicates. Potential interfering loop is indicated by blue arrows. g, Insulation
score profiles centred around the boundaries and interfering structural loop 
anchors that solely reside within cluster 1, 2 or 3 E/P loops. h, Sanger 


sequencing profiles showing deletion of the CTCF core motif at the upstream 
anchor of the structural loop (blue arrows in f) that potentially interferes with
the cluster 3 E/P loop at the Commd3 locus (red arrows in f). i, ChIP-qPCR
showing the abrogation of CTCF and Rad21 binding at the edited site in f. n = 3
biological replicates. Data are mean ± s.e.m. P values were calculated by
two-sided Student's t-test. j, Schematic showing potential behaviour of cluster 3 E/P
loops before and after deletion of the interfering structural loop anchor. 

k, Capture-C interaction profiles between Commd3 promoter and downstream 
cis-regulatory element (red bars) on wild-type and interfering anchor-deleted 
mutant cells over time. The location of the capture probe is indicated by the 
anchor symbol. The deleted CTCF site is indicated by green triangles. 
Formation of the transient loop is indicated by red arches. l, Quantification
showing read density of the red regions in k. n = 3 and 2 biological replicates for
wild-type and mutant cells, respectively. Data are mean ± s.e.m. P values were
calculated by two-sided Student's t-test. m, Box plots showing ChIP-seq signals
of indicated histone modifications at anchors that solely participate in cluster
1, 2 or 3 (transient) E/P loops (n = 2,612, 1,338 and 413, respectively). For all box
plots, centre lines denote medians; box limits denote 25th-75th percentile; 
whiskers denote 5th-95th percentile. Pvalues were calculated using a two- 
sided Mann-Whitney U-test. 
Extended Data Fig. 9 | Relationship between post-mitotic structural
organization and gene reactivation. a, Meta-region analysis of Pol II
occupancy of active genes across all stages of the cell cycle. TSS, transcription
start site; TES, transcription end site. b, Bin plots showing the positive
correlation between Pol II ChIP-seq signal strength and eigenvector 1
(asynchronous G1E-ER4 cells, ref. 27; 25-kb binned) genome-wide. c, Left, schematic
showing genes that are within early or late domains. Right, average Pol II
occupancy of genes that reside in prometaphase (n = 2,274 genes),
ana/telophase (n = 2,114 genes), early G1 (n = 1,159 genes) and mid G1 (n = 303 genes)
emerging domains. Data are mean ± 99% confidence interval. d, Heat map
showing gene-body Pol II occupancy across all stages of the cell cycle. Genes
are ranked by their PC1 values ('spikiness'). e, Genome browser tracks showing


27. Hsu, S.C. et al. The BET protein BRD2 cooperates with CTCF to enforce transcriptional and 
architectural boundaries. Mol. Cell 66, 102-116 (2017). 


representative examples of early spiking (Kpna2) and gradually activating
(Nedd4) genes. f, Quantification of gene-body Pol II occupancy in e. n = 2
biological replicates for 0 h, and n = 3 biological replicates for other time
points. Data are mean ± s.e.m. g, Schematic showing the stratification of genes
on the basis of their involvement in E/P loops. h, Table showing the number of
genes that are solely involved in clusters of E/P loops. i, Average gene-body Pol
II occupancy of the genes in h over time. Sample sizes are shown in h. Data are
mean ± s.e.m. j, Box plots showing the spikiness (PC1) of genes in h. Sample
sizes are shown in h. For all box plots, centre lines denote medians; box limits
denote 25th-75th percentile; whiskers denote 5th-95th percentile. P values
were calculated using a two-sided Mann-Whitney U-test.
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
A description of all covariates tested


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)

For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


[ ] Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 
Data collection For flow data collection, we used FACSDiva 8 (BD Biosciences) and Summit (Beckman Coulter).
For image collection, we used MetaMorph.
For high-throughput sequencing data collection, we used NextSeq Control Software v2.2.0.


Data analysis For statistical analyses, we used R (R studio) or GraphPad Prism 7. 

For imaging processing, we used FiJi (Image J 2.0.0). 

For flow chart generation, we used FlowJo 10.4.0. 

For high-throughput sequencing data processing and subsequent data analyses, we used:
pandas 0.22.0
scipy 0.19.1
numpy 1.13.3
HiC-Pro_2.7.7
juicer_tools_0.7.5.jar, 0.7.0 jar
FastQC 0.11.5
trim galore 0.4.1-0
Cutadapt 1.18
FLASH 1.2.8
CCAnalyzer3
bowtie 2
bowtie 0.12.7
SAMtools v0.1.19
macs2 v2.1.0, 2.1.1
kent UCSC Utilities
bwtool 1.0
HOMER 4.9.1
BEDtools 2.27.1
deeptools 2.5.4

For loop and stripe identification and DI+sweep, see the Methods section (code is available upon request from the corresponding authors).
3D netmod domain calling: https://bitbucket.org/creminslab/3dnetmod_method_v3.0_development

rGMAP 1.4
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

- A description of any restrictions on data availability


All figures include publicly available data. All ChIP-seq, Capture-C and Hi-C raw and processed data generated from this study are now deposited into the GEO 
database with accession number GSE129997 for public access. ChIP-seq files of histone modifications shown in figure 4 and Extended data figures 7 and 8 are 
available from the GEO database with accession numbers: H3K27ac (GSE61349), H3K4me1 (GSM946535), H3K4me3 (GSM946533), H3K36me3 (GSM946529) and 
H3K9me3 (GSM946542) 


Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


[X] Life sciences [ ] Behavioural & social sciences [ ] Ecological, evolutionary & environmental sciences


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size Sample size was not pre-determined. We used sample sizes commonly accepted for high throughput genome wide experiments. We 
performed 2 biological replicates for Hi-C, 2-3 biological replicates for ChIP-seq and ChIP-qPCR, and 2-3 biological replicates for Capture-C. Hi- 
C and ChIP-seq data were pooled for down-stream analyses. 


Data exclusions Experiments were done in multiple replicates. Replicates with technical failure were removed.


Replication 2 biological replicates for Hi-C and at least 2-3 biological replicates for ChIP-seq and Capture-C were generated.


Randomization Experiments were not randomized. No animal or human subjects were involved in this study.


Blinding Researchers were not blind to group allocation. Blinding was not relevant to our study as no human subjects were involved. 
Validation


Eukaryotic cell lines


Genome browser session (e.g. UCSC) https://genome.ucsc.edu/s/thomasgilgenast/mitosis-ctcf-rad21-pol2-chipseq


Methodology


Replicates See methods section. Briefly, 2-3 biological replicates were generated per time point per antibody. Specifically:
For CTCF, we generated 2 biological replicates per time point except prometaphase, for which we performed 3 biological
replicates.
For Rad21, we performed 2 biological replicates per time point.
For Pol II, we performed 3 biological replicates per time point except prometaphase, for which we performed 2 biological
replicates.


Sequencing depth See supplementary table 7


Antibodies anti-CTCF Millipore, 07-729
anti-Rad21 Abcam, ab992
anti-Pol II Cell Signaling, 14958


Peak calling parameters Sequencing reads were mapped to the reference mouse genome mm9 using bowtie (0.12.7, "-m 2 --tryhard"). Reads were
filtered to remove non-uniquely mapped reads and PCR duplicates using SAMtools (v0.1.19) and converted to bed format
using BEDtools (v2.27.1, "bedtools bamtobed").

For CTCF and Rad21, filtered reads from each biological replicate were pooled together and down-sampled to equivalent
read counts across all cell cycle stages. Peaks were identified using MACS2 with punctate calling for both CTCF and
Rad21 (p-values 1e-8 and 1e-4, respectively), using each IP's cell cycle stage matched input as the control.

For Pol II, peaks were called with MACS2 for each replicate with a p-value cutoff of 1e-4, using each IP's cell cycle stage
matched input as the control.
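
The down-sampling of pooled reads to equivalent counts across cell cycle stages can be illustrated as follows. This is a minimal sketch under stated assumptions: reads are represented as items in Python lists, the target depth is taken to be the smallest pooled count, and the function name and seed are hypothetical. The form does not say which tool or target depth was actually used.

```python
import random


def downsample_to_common_depth(reads_by_stage, seed=0):
    """Randomly subsample each stage's pooled reads to the smallest stage's count."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    target = min(len(reads) for reads in reads_by_stage.values())
    return {
        stage: rng.sample(reads, target)  # sampling without replacement
        for stage, reads in reads_by_stage.items()
    }


# Toy data (hypothetical): integers stand in for filtered, pooled reads.
toy_reads = {
    "prometaphase": list(range(50)),
    "early_G1": list(range(80)),
    "late_G1": list(range(65)),
}
toy_downsampled = downsample_to_common_depth(toy_reads)
```

Equalizing depth before peak calling keeps differences in peak number or strength between stages from simply reflecting differences in sequencing depth.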
Data quality 1). Raw fastq files were assessed with FastQC (v0.11.5) prior to processing.
2). Peaks were called using input controls corresponding to every cell cycle stage.
3). Peaks were called using p-value cutoffs described above and the default above 5-fold enrichment.
4). Correlation among replicates was assessed (Extended Data Fig. 1g-i).
5). Peaks of Rad21 and CTCF were largely overlapping and motif analysis revealed the expected CTCF motif within peaks.
6). Peaks of Pol II were largely located at expected genomic regions (TSS, gene bodies).


Software For ChIP-seq data processing and analyses, we used bowtie version 0.12.7, SAMtools v0.1.19, macs2 2.1.1, 2.1.0, kent UCSC
Utilities, bwtool version 1.0, HOMER version 4.9.1, BEDtools and deeptools 2.5.4


Flow Cytometry 


Plots 


Confirm that: 


The axis labels state the marker and fluorochrome used (e.g. CD4-FITC). 
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).


All plots are contour plots with outliers or pseudocolor plots. 


A numerical value for number of cells or percentage (with statistics) is provided. 


Methodology 

Sample preparation See method section.
Briefly, actively proliferating G1E-ER4 cells were synchronized at prometaphase with nocodazole treatment. Cells were then
released from nocodazole for several durations (0 h, 25 min, 1 h, 2 h and 4 h). Cells were harvested, washed with PBS and
crosslinked with 1% PFA for 10 min. Crosslinks were quenched with glycine for 5 min and cells were then permeabilized with 0.1%
Triton X-100. Finally, cells were stained with 20 ng/ml DAPI and subjected to cell sorting.

Instrument Beckman Coulter Moflo Astrios sorter/Becton Dickinson FACSAria Fusion sorter 

Software Flow charts were generated using FlowJo 10.4.0 


Cell population abundance The purity of the sorted prometaphase population was >98%. This was
confirmed by DAPI staining and the disassembly of lamin B1.


Gating strategy Prometaphase cells were gated on mCherry (high), DAPI (4N) and pMPM2 (high) fluorescent signal. Ana/telophase cells were
gated based on DAPI (4N) and mCherry (low, relative to prometaphase) signals. G1 samples were gated based on DAPI (2N) and
mCherry (low) signals.


Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information. 
Calcium homeostasis modulators (CALHMs) are voltage-gated, Ca²⁺-inhibited
nonselective ion channels that act as major ATP release channels, and have important
roles in gustatory signalling and neuronal toxicity. Dysfunction of CALHMs has
previously been linked to neurological disorders. Here we present cryo-electron
microscopy structures of the human CALHM2 channel in the Ca²⁺-free active or open
state and in the ruthenium red (RUR)-bound inhibited state, at resolutions up to 2.7 Å.
Our work shows that purified CALHM2 channels form both gap junctions and
undecameric hemichannels. The protomer shows a mirrored arrangement of the
transmembrane domains (helices S1-S4) relative to other channels with a similar
topology, such as connexins, innexins and volume-regulated anion channels. Upon
binding to RUR, we observed a contracted pore with notable conformational changes
of the pore-lining helix S1, which swings nearly 60° towards the pore axis from a
vertical to a lifted position. We propose a two-section gating mechanism in which the
S1 helix coarsely adjusts, and the N-terminal helix fine-tunes, the pore size. We
identified a RUR-binding site near helix S1 that may stabilize this helix in the lifted
conformation, giving rise to channel inhibition. Our work elaborates on the principles
of CALHM2 channel architecture and symmetry, and the mechanism that underlies
channel inhibition.


ATP release channels have a fundamental role in many neurological 
functions—including the modulation of excitatory synaptic strength, 
long-term synaptic potentiation and neuronal excitability—by mediat- 
ing the purinergic signalling pathway in the central nervous system”. 
CALHMs act as one of the major ATP release channels, together with 
maxi-anion channels, volume-regulated anion channels (VRACs), con- 
nexins and pannexins* ®", CALHMs are abundantly expressed in taste 
bud cells and have an important role in sensing sweet, bitter and umami 
flavours’. They are activated by a reduction in extracellular calcium 
and membrane depolarization, which triggers a signalling cascade in 
the neural gustatory pathways”. CALHMsalso have an important role 
in cortical neuron excitability”. Dysregulation of CALHMI, as well as 
its P86L polymorphism, have previously been linked to neurological 
disorders such as Alzheimer’s disease and ischaemic brain damage’. 

CALHMs are predicted to have four transmembrane helices, similar to
connexins, innexins, pannexins and VRACs. However, the architecture,
symmetry and domain arrangement of CALHMs remain unknown.
Connexins and VRACs are hexamers and innexins are octamers;
the current concept is that CALHM1 and CALHM3 form hexamers and
that they do not form gap junctions, owing to N-glycosylation in the
extracellular loop.

In addition to Ca²⁺, CALHMs can be modulated by a variety of small
molecules, including RUR, Gd³⁺ and 2-aminoethoxydiphenyl borate
(2-APB). RUR is a hexavalent polysaccharide stain that non-specifically
inhibits many ion channels through an unknown molecular mechanism.


Structural determination 


Our whole-cell electrophysiology data showed that human CALHM2
produces a robust current in the absence of Ca²⁺. The current showed
no obvious voltage dependence but was inhibited by Ca²⁺ or RUR in a
voltage-dependent manner (Extended Data Fig. 1a, c-f). To elucidate
the assembly of CALHM2 and to understand the molecular mechanism
that underlies channel gating, we studied the human CALHM2 channel
in the presence of EDTA or the antagonist RUR using cryo-electron
microscopy (cryo-EM).

The initial cryo-EM experiments using GFP-tagged constructs yielded 
solely hemichannel particles. The two-dimensional classes show a 
fuzzy tail, which is probably the GFP tag, and the three-dimensional 
reconstructions were of low resolution (Extended Data Fig. 2a, c, d). 
The GFP-cleaved construct not only improved the cryo-EM map but 
also showed both hemichannel and gap-junction particles at a ratio of 
approximately 1:1 at grid concentration (about 18 µM), representing the
approximate dissociation constant (Kd) of the docking of hemichannels
(Extended Data Figs. 2b, 3). Unlike the CALHM1 channel, CALHM2 did
not show N-glycosylation (Extended Data Fig. 2e, f), which probably 
explains the existence of a gap junction. Only hemichannels, and not 
gap junctions, were observed in the presence of RUR (Extended Data 
Fig. 4). These observations indicate that both the C-terminal GFP tag 
and the binding of RUR may affect the conformation of the docking site. 
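
The link between a 1:1 particle ratio and a dissociation constant can be made explicit with a simple two-state docking model. The sketch below assumes that docking is a reversible association of two hemichannels (H) into one gap junction (G), that particle counts on the grid reflect solution concentrations, and that concentrations are expressed in units of assembled channels. These are illustrative assumptions and are not stated in the text.

```latex
2\,\mathrm{H} \rightleftharpoons \mathrm{G}, \qquad
K_d = \frac{[\mathrm{H}]^2}{[\mathrm{G}]}
\quad\Longrightarrow\quad
K_d \approx [\mathrm{H}] \quad \text{when } [\mathrm{H}] \approx [\mathrm{G}] .
```

Under this model, a hemichannel-to-gap-junction ratio near 1:1 places the dissociation constant at roughly the free hemichannel concentration, so the sample concentration can serve as an order-of-magnitude estimate.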

We determined the structures of CALHM2 in the presence of
EDTA as hemichannels and gap junctions (EDTA-CALHM2^hemi and
EDTA-CALHM2^gap, respectively) and in the presence of RUR
(RUR-CALHM2), at resolutions of 3.3, 3.5 and 2.7 Å, respectively (Extended
Data Figs. 3-6, Supplementary Table 1). Our structures unambiguously
showed that CALHM2 is an undecamer (Fig. 1a-g). Moreover,
we observed 11 strong cylinder-shaped densities in the RUR-CALHM2
structure, which represent RUR molecules (Fig. 1c, e, Extended
Data Fig. 4).

The S1 helix and the N-terminal helix (NTH) were poorly defined
relative to the rest of the channel, indicating high flexibility. To assess
the conformational heterogeneity of these helices, we applied an
approach that combined symmetry expansion and signal subtraction,
in which all of the subunits were subtracted and classified
(Extended Data Figs. 3, 4). In RUR-CALHM2, two distinct conformations
of helix S1 appeared, with 70% in the lifted conformation and 13% in the
vertical conformation; the rest were not well-defined. RUR densities
were observed only in the class with lifted helix S1. By contrast,
EDTA-CALHM2^hemi contained approximately half of the helix S1
in a vertical conformation, and the rest was invisible. There are
non-resolvable densities in the pore vestibule that may represent the
intermediate states of helix S1 between the lifted and vertical conformations.

Fig. 1 | The overall architecture of CALHM2. Odd-numbered subunits are in
blue (for RUR-CALHM2) or red (for EDTA-CALHM2^hemi), and even-numbered
subunits are in white. The eleventh subunit is in green, lipid-like densities are in
purple and RUR densities are in orange. a, Selected two-dimensional class
averages of RUR-CALHM2. b, c, The three-dimensional reconstruction of
RUR-CALHM2, viewed parallel to the membrane (b) and from the extracellular side
of the membrane (c). SS bonds, disulfide bonds. d-g, The structures of
RUR-CALHM2 (d, e) and EDTA-CALHM2^hemi (f, g). h, Selected two-dimensional
class averages of EDTA-CALHM2^gap. i, j, The three-dimensional reconstruction
(i) and atomic model (j) of EDTA-CALHM2^gap. Unsharpened reconstructions
are shown as transparent envelopes.


We use these two extreme conformations (RUR-CALHM2 with all
of helix S1 in the lifted conformation and EDTA-CALHM2 with all
S1 in the vertical conformation) to discuss the gating mechanism
and the action of the antagonist RUR (Fig. 1c, e, g). Owing to intrinsic
positional uncertainty, we fitted only the protein backbone into the
densities of the S1 helix and NTH, and the exact position of residues
in this region (residues 13-40) should be interpreted with caution.
Nevertheless, we provide electrophysiology data that validate the
placement of helix S1 (discussed in the section ‘Subunit structure
and RUR-binding site’). A part of the extracellular loops is also poorly
defined (residues 138-152).


Overall architecture 


The CALHM2 structure assembles as an undecamer with each protomer
consisting of a large N-terminal transmembrane domain (TMD) (which
comprises helices S1, S2, S3 and S4, and NTH), an intracellular
C-terminal domain (CTD) and a small extracellular linker region (Fig. 1b-g).
The hemichannels show an overall truncated cone shape that consists
of helices S2, S3 and S4 as a side wall, the CTD as a base and the
extracellular linker region as a rim, with the intracellular side wider
than the extracellular. The first long helix in the CTD (CH1) intercrosses
neighbouring subunits, and thus weaves CALHM2 into a sturdy undecamer
with a diameter that is 1.6- and 1.2-fold larger than those of connexins
and innexins, respectively (Extended Data Fig. 7).

Fig. 2 | A single subunit of CALHM2, and RUR-binding site. a-c, Cartoon
representation of EDTA-CALHM2^hemi (a, b) and RUR-CALHM2 (c). The S4 helix
in b and c is shown as a transparent tube for clarity. d, Domain organization of
EDTA-CALHM2^hemi. e, f, TMD organization of EDTA-CALHM2^hemi (e) and
connexin 43 (f) viewed from the extracellular side. g, The loose contacts
between the S1 and S3 helices in EDTA-CALHM2^hemi. Distances (in Å) between
Cα atoms of adjacent residues in helices S1 and S3 are labelled. h, i, RUR-binding site.

When viewed from the extracellular side, the EDTA-CALHM2^hemi
structure contains an unusually large pore that is lined by helix S1,
which is poised parallel to the pore axis; we thus term this conformation
‘vertical S1’ (Fig. 1g). By comparison, RUR-CALHM2 shows a vertical
compression and a horizontal expansion (Fig. 1d-g). Most notably,
the pore in RUR-CALHM2 is markedly smaller: helix S1 swings towards
the pore axis in what we term the ‘lifted S1’ conformation (Fig. 1c, e).
A RUR density was observed underneath helix S1, and probably supports
this helix in the lifted conformation, thus stabilizing the channel
in an inhibited state. This agrees with previous functional studies that
have shown that RUR inhibits CALHMs (Extended Data Fig. 1a,
c). Notably, the intracellular halves of helices S3 and S4 are relatively
far apart, which creates a gap in which a lipid-like density is observed
(Fig. 1b). Such a loose contact between the S3 and S4 helices might
facilitate the conformational change of S3 during channel gating because
the pore-lining S1 is attached to S3.

Similar to connexins and innexins, EDTA-CALHM2^gap is formed by
two hemichannels docked in a head-to-head manner, forming a thick cylinder
with a large diameter (Fig. 1h-j). The docking region of EDTA-CALHM2^gap
is considerably shorter than the gap junctions of connexin and
innexin, owing to a smaller extracellular linker region (Extended Data
Fig. 7a, c, d). Two disulfide bonds connect the S3-S4 linker and S1-S2
linker, which may have an important role in stabilizing the docking of
two hemichannels (Fig. 1d). Disulfide bonds are present in similar
positions in connexins and innexins. Finally, we observed two
strong lipid-like densities close to the extracellular linker region,
which probably contribute to maintaining the integrity of the docking
site (Fig. 1b, i).


Subunit structure and RUR-binding site 

The protomer of CALHM2 is L-shaped: two long and straight helices
(helix S4 in the TMD and CH1 in the CTD) run nearly perpendicular to
each other (Fig. 2a-c). Both the S2 and S3 helices are multi-segment
helices. The S2a and S3b segments form a plane together with helix S4,
which runs parallel to the pore axis; the S2b and S3a segments reside on
the top of the CTD, mediating the only contact between the TMD and
the CTD (Fig. 2d). A comparison of the CALHM2 protomer with those
of connexins, innexins and VRACs showed notable differences in size,
shape and domain organization (Extended Data Fig. 7c, d).

Fig. 3 | Inter- and intrasubunit interactions. a, A view of the intersubunit
interface of EDTA-CALHM2^hemi at the TMD (vertical rectangle; shown in b, c)
and at the CTD (horizontal rectangle). b, c, The intersubunit interface at the
TMD between helix S2 and the adjacent S4, viewed in parallel to the membrane
(b) or from the intracellular side (c). d, Interface at the CTD viewed from the
intracellular side. Black filled circles indicate the four intersubunit interfaces
of CH1. e, Enlargement of the large circle in d showing the interactions of three
neighbouring CH1 helices. f, Enlargement of the large circle in d, showing the
TMD-CTD interface and the exterior circular layer in the CTD formed by CH3,
CH4 and the CH3-CH4 linker. g, h, Thermostability test of wild-type CALHM2,
the mutants of key residues involved in the CH1 interactions in e, and the
mutant with a deletion of 52 residues at the C-terminal end (ΔC52). Both F251A
and H238A considerably decreased the thermostability of the protein, but the
ΔC52 construct showed no effect. n = 3, 3, 4 or 3 biologically independent
experiments were performed for wild-type CALHM2, CALHM2(H238A),
CALHM2(F251A) or CALHM2(ΔC52), respectively. Data are mean ± s.e.m. Each
dot indicates the value of one single independent experiment.

The most notable distinction between CALHM2 and other channels
with a similar topology (such as connexins, innexins and VRACs) is at the
TMD. Viewed from the extracellular side, the S1, S2, S3 and S4 helices
are arranged anticlockwise in EDTA-CALHM2^hemi, with S2, S3 and S4
approximately in a plane; by contrast, the arrangements in connexins,
innexins and VRACs are all clockwise, with helices S2, S3 and S4 forming
a compact helix bundle (Fig. 2e, f). As a consequence, the S1 helix
in EDTA-CALHM2^hemi is loosely attached only to the S3 (Fig. 2e, g), but
in connexins, innexins and VRACs, it is clamped in the helix bundle and
forms extensive interactions with S2 and S4 (Fig. 2f). We suggest that
such a loose contact of helix S1 to the rest of the TMD in CALHM2 gives
this helix a high flexibility to swing up and down. In RUR-CALHM2, helix
S1 was detached from S3, and a RUR molecule occupied the vertical S1
position (Fig. 2c, h, i). Because RUR is positively charged, to validate
the RUR-binding site and the placement of the lifted S1, we studied a
charge-reversing mutant of E37 (a key residue on helix S1 that interacts
with RUR). Although CALHM2(E37R) displays Ca²⁺-dependent gating
similar to that of wild-type CALHM2, the inhibition of this mutant by
RUR is nearly abolished (Extended Data Fig. 1b, c). Underneath the
lifted S1, we also observed part of the NTH, which is involved in
voltage-sensing in CALHM1, connexin, pannexin and innexin (Fig. 2c).

Fig. 4 | RUR inhibition and ion-conducting pore. a, Superimposition of
EDTA-CALHM2^hemi (red) and RUR-CALHM2 (blue) using the 11 CH1 helices. Only two
subunits are shown, for clarity. b, c, The shape of the ion-conducting pore in
EDTA-CALHM2^hemi (b) and in RUR-CALHM2 (c). The smallest pore diameters
were estimated without considering the involvement of the NTHs; they do not
represent the real pore size and are used only to show the change of the pore
profile. d, e, Superimposition of a single subunit of EDTA-CALHM2^hemi (red) and
RUR-CALHM2 (blue) using either the TMD (d) or the CTD (e). Regions with
conformational changes are highlighted in colour and the rest is shown as a
transparent cartoon, for clarity. f, Enlargement of the rectangle in e, showing
conformational changes between EDTA-CALHM2^hemi (red) and RUR-CALHM2
(blue) at the TMD-CTD interface. The CTD is shown as a transparent cartoon
for clarity.


The CTD in CALHM2 is the only component on the intracellular
side and has an extended shape due to the long CH1, plus the three
short α-helices (CH2, CH3 and CH4) (Fig. 2a-d). By comparison,
the intracellular regions of innexins and VRACs are formed by 
both the CTD and the S2-S3 linker (Extended Data Fig. 7c, d). The 
extracellular side of CALHM2 consists mainly of a flat S3-S4 linker 
that is involved in the docking of two hemichannels (Fig. 2a). The S1-S2 
linker, on the other hand, is very short and not directly involved in 
the docking region. The longer S1-S2 linker in connexin and innexin 
protrudes into the extracellular space, forms an intact structure with 
the S3-S4 linker and participates directly in the docking (Extended 
Data Fig. 7c, d). 


Channel assembly 


The CALHM2 hemichannel exhibits a compact exterior through extensive
subunit-subunit interactions with three major interfaces: one at
the TMD between adjacent subunits (Fig. 3a-c), and the other two at
the CTD (Fig. 3d-f). The extracellular linker region of CALHM2 lacks 
interactions, which presumably provides flexibility for the docking of 
two hemichannels into a gap junction (Fig. 3a). 

At the TMD layer, only the S2 and S4 helices of adjacent subunits 
interact with each other, mainly through hydrophobic interactions 
(Fig. 3b, c). By contrast, the TMD in connexins, innexins and VRACs 
forms a compact helix bundle, which results in different inter-subunit 
interfaces. Specifically, whereas in connexin the S1 and S2 helices of adjacent
subunits mediate the inter-subunit interface, in innexin there is
a lack of interaction between adjacent TMDs.

At the CTD layer, each long CH1 projects out to interact with 4 adjacent
CH1 helices (2 on each side), weaving the 11 subunits into a large
circular frame (Fig. 3d). We identified a major interaction located in
the middle of CH1, sandwiched by the N- and C-terminal regions of
two adjacent CH1 helices. These three adjacent CH1 helices dovetail
to each other through aromatic and charged residues (Fig. 3e), which
are conserved in the CALHM family and have crucial roles in channel
assembly and stability (Fig. 3g, h, Extended Data Fig. 8). Moreover,
CH3 and CH4, along with their linker, buckle into the space of the
circular frame, forming an additional exterior circular layer (Fig. 3d,
f). Such a complex interaction network in the CTD is absent in connexins,
innexins and VRACs. The contact between the TMD and the
CTD is mediated mainly by the CH3-CH4 linker with its cognate S2b
and S3a segments, through the interaction between Y285 and H78
(Fig. 3f). Notably, a P86L polymorphism in CALHM1 that is linked with
Alzheimer’s disease is located in the TMD-CTD interface (Extended
Data Fig. 8). This polymorphism changes the functional properties
of CALHM1, which suggests this interface has an important role
in channel gating.

Fig. 5 | Schematic of the RUR-induced inhibition mechanism. Conformational
changes are shown between EDTA-CALHM2^hemi (a) and RUR-CALHM2 (b). The
observed movement of helix S1 and the proposed movement of the NTH during
channel inhibition are indicated. The region that cannot be modelled is
indicated by dashed lines.


Inhibition mechanism by RUR 


We compared the structures of EDTA-CALHM2^hemi and RUR-CALHM2
(Fig. 4, Supplementary Video 1). The 11 CH1 helices of each structure
are well-aligned, which provides support for the role of CH1 in
constituting a rigid scaffold of the channel. By contrast, the TMD
displays notable conformational changes in two aspects. First, RUR
occupies the position of the vertical S1, and thus drives helix S1 to swing
nearly 60° towards the pore axis; this reduces the pore diameter by
approximately 27 Å (Fig. 4b, c). Second, there is obvious vertical
compression and horizontal expansion, resulting in the entire TMD sliding
towards the CTD and remodelling at the TMD-CTD interface (Fig. 4a).

To further understand the action of RUR, we compared single subunits
of EDTA-CALHM2^hemi and RUR-CALHM2 by superimposing their
CTD or TMD (Fig. 4d, e). With the exceptions of helix S1 and the
TMD-CTD interface (Fig. 4d), the TMD displays a rigid-body inward tilting
towards the pore axis upon binding of RUR, which explains the vertical
compression of the channel (Fig. 4e). In EDTA-CALHM2^hemi, the vertical
S1 loosely attaches to helix S3, with the S3a segment sterically restricting
the CH3-CH4 linker on the CTD. The CTD is further coupled to the
S2b segment through the interaction between Y285 on the CH3-CH4
linker and H78 on the S2b, thus supporting the TMD in an elevated
position (Fig. 4f). The binding of RUR detaches helix S1 from S3, and
segment S3a swaps outward and releases its restriction on the
CH3-CH4 linker. As a result, Y285 flips nearly 180° (Fig. 4f), rupturing the
coupling between Y285 and H78 and leading the entire TMD to move
towards the CTD.

To define the functional states of EDTA-CALHM2^hemi and
RUR-CALHM2, we inspected their ion-conducting pores. The pore diameters
of EDTA-CALHM2 and RUR-CALHM2 at two extreme conditions, with
all S1 in the vertical or lifted conformation, were estimated to be 50
and 23 Å (respectively) without considering the 11 copies of the NTH
(Fig. 4b, c). Although the NTHs are disordered in the high-resolution
structures refined using C11 symmetry, we observed prominent S1
helix and NTH densities restricting the pore in one of the asymmetric
RUR-CALHM2 classes (Extended Data Fig. 4), implying that the real pore
sizes should be smaller. Indeed, truncation of the first 20 N-terminal
residues (CALHM2(ΔN20)) showed a notable reduction in inhibition by
RUR, which provides support for a role of the NTH in channel gating by
physically restricting the pore; such a role is consistent with other
channels with a similar topology (including connexins, innexins
and VRACs). Moreover, a charge-neutralizing R10A mutant
(CALHM2(R10A)) in the NTH showed a marked reduction in inhibition
by RUR at negative membrane potentials, indicating the involvement
of the NTH in the voltage-dependent inhibition by RUR (Extended Data
Figs. 1a, d-f, 9).
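
The two quoted pore diameters can be related with simple arithmetic. The following sketch is only an illustration: it treats the pore constriction as a circle (an assumption made here, not a claim from the structures) and, like the estimates in the text, ignores the NTHs. The constant and function names are hypothetical.

```python
import math

# Pore diameters (in Å) quoted in the text, estimated without the NTHs.
D_VERTICAL = 50.0  # EDTA-CALHM2 hemichannel, all S1 helices vertical
D_LIFTED = 23.0    # RUR-CALHM2, all S1 helices lifted


def circle_area(diameter):
    """Area of a circle of the given diameter (assumed circular pore)."""
    return math.pi * (diameter / 2.0) ** 2


# Change in diameter between the two extreme conformations.
diameter_change = D_VERTICAL - D_LIFTED

# Fraction of the open cross-section that remains in the lifted state.
area_fraction = circle_area(D_LIFTED) / circle_area(D_VERTICAL)

print(f"diameter reduction: {diameter_change:.0f} Å")
print(f"remaining cross-section: {area_fraction:.1%}")
```

The difference of the two diameters reproduces the reduction of approximately 27 Å quoted for the swing of helix S1, and under the circular-pore assumption the cross-section falls to roughly a fifth of its open value.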

We suggest that the EDTA-CALHM2^hemi structure represents an active
or open state, and the RUR-CALHM2 structure represents an inhibited
state. On the basis of its binding site, RUR functions as an antagonist
rather than a pore blocker. Although RUR is a nonspecific inhibitor of
many ion channels, this is, to our knowledge, the first time that
the molecular mechanism that underlies inhibition by RUR has been
elucidated. Relative to wild-type CALHM2, Ca²⁺-dependent inhibition
remained unaffected for CALHM2(ΔN20) and CALHM2(R10A), implying
that RUR and Ca²⁺ may not share the same binding site and that they
may inhibit CALHM2 through different mechanisms (Extended Data
Figs. 1a, d-f, 9).


CALHM2 gap junction


The EDTA-CALHM2 gap junction is formed by two hemichannels docked
through the extracellular S3-S4 linker (Extended Data Fig. 10a). In
contrast to the hemichannel, the pore in EDTA-CALHM2^gap is smaller,
and helix S1 is in a lifted conformation similar to that in RUR-CALHM2
(Extended Data Fig. 10b). The S1 helix is less well-defined than that in
RUR-CALHM2 because it lacks the stabilization of the RUR molecule
underneath. The docking of two hemichannels results in a marked
conformational rearrangement at the junction, in which a flat loop
in the S3-S4 linker is reformed into a triangle shape. As a result, two
interfaces are created at the junction (Extended Data Fig. 10c, d): a
primary interface between the paired subunits of two hemichannels
and a minor interface between the diagonal subunits. The deletion of a
segment (Δ143-146) in the S3-S4 linker hindered CALHM2 in forming
a gap junction (Extended Data Fig. 10d, e).

Further investigation is required to reveal whether EDTA-CALHM2^gap
is in an inhibited state similar to that of RUR-CALHM2, and how the 
two hemichannels within a gap junction coordinate with each other. 
In addition, despite the observation of a CALHM2 gap junction in vitro, 
its existence in vivo remains to be determined. 


Conclusion 


Our CALHM2 structures show the undecameric assembly of a CALHM
family member, and describe a molecular mechanism that underlies
inhibitory gating induced by the antagonist RUR (Fig. 5). The primary
determinants for the channel gate of CALHM2 are the S1 helix and NTH.
The Ca²⁺-free hemichannel favours helix S1 in a vertical conformation
and loosely attached to S3, resulting in a large open pore. When RUR
occupies the space in which the vertical S1 is located, it stabilizes helix S1
upward, which contracts the pore. We propose a two-section inhibitory
mechanism, in which the S1 helix adjusts the pore size coarsely and the
NTH makes fine adjustments, eventually physically occluding the pore.
We speculate that helix S1 and the NTH of the 11 subunits may move
individually (rather than concertedly) to assume the conformational
change and thus adjust the pore size with high flexibility, which enables
molecules of various sizes to pass through the pore. Our structures
build a solid foundation for understanding the physiology and
pharmacology of the CALHM family.


Methods


No statistical methods were used to predetermine sample size. The 
experiments were not randomized and investigators were not blinded 
to allocation during experiments and outcome assessment. 


Cloning 

The full-length human CALHM1, CALHM2 and CALHM3 genes
(http://www.uniprot.org; UniProtKB numbers: Q8IU99, Q9HA72 and Q86XJ0,
respectively) were synthesized by Genscript and subcloned into a pEG
BacMam vector comprising an N- or C-terminal thrombin cleavage
site, GFP and His8 tag. We focused on CALHM2 because it showed the
best biochemical properties among CALHM1, CALHM2 and CALHM3
(Extended Data Fig. 1g-i). Primers for CALHM2 mutants were synthesized
by Eurofins USA, and the mutants were produced using site-directed
mutagenesis and/or primer extension PCR.


Construct screening 

tsA201 cells were seeded into 2-ml plates to a final density of
0.9-1 × 10⁶ cells/ml in DMEM containing 10% (v/v) fetal bovine serum,
transfected with N-terminally or C-terminally tagged CALHM1, CALHM2 and
CALHM3 plasmids using Lipofectamine 2000 (ThermoFisher), and
placed at 37 °C for 24 h. The next day, sodium butyrate was added to
each well to a final concentration of 10 mM and the cells were placed at
30 °C. Then, 24 h later, cells were imaged, collected, washed with buffer
containing 150 mM NaCl and 20 mM Tris-HCl pH 8.0 (TBS) and stored at
-80 °C. Cell aliquots were thawed on ice, mixed with 1% digitonin-containing
buffer and left to incubate on ice for 30 min. Mixtures were then centrifuged at
87,000g using a TLA 100.3 rotor (Beckman Coulter) for 20 min at 4 °C.
The supernatant was carefully pipetted off and injected onto a 3-ml GE
Healthcare Superose 5/150 GL column, prewashed with TBS buffer
plus detergent, at a flow rate of 0.3 ml/min, and analysed to determine
protein solubility and retention time.


Protein expression and purification 

Purification was carried out as previously described, with
modifications. DNA was transformed into DH10Bac cells to
produce bacmid for baculovirus production. Flasks of tsA201 suspension
cells (ATCC CRL-11268, tested negative for mycoplasma contamination,
and authenticated) were grown to a density of 3.0-3.5 × 10⁶ cells/ml
at 37 °C. P2 virus (8% v/v) was added to each flask and incubated
at 37 °C for 12 h. Sodium butyrate was added to a final concentration
of 10 mM and flasks were incubated at 30 °C. Cells were collected 48 h
after infection and frozen at -80 °C. For use, cells were thawed on ice
and resuspended in TBS buffer supplemented with 1 mM PMSF, 0.8 µM
aprotinin, 2 µg/ml leupeptin, and 2 mM pepstatin A; then, they were
lysed via sonication for 15 min. The lysate was centrifuged at 7,000g
for 10 min. The supernatant was ultracentrifuged at 186,000g in a Ti45
rotor (Beckman Coulter) at 4 °C for 1 h. Membrane pellets were homogenized
in TBS supplemented with protease inhibitors and solubilized
in 1% digitonin at 4 °C for 2 h. Membrane debris was removed by
centrifugation at 186,000g in a Ti45 rotor at 4 °C for 1 h.

The solubilized protein was applied to 10 ml of TALON resin preequilibrated
with 3 column volumes of TBS supplemented with 10 mM imidazole
and 0.1% digitonin (buffer A). The column was washed with
5 column volumes of buffer A, and eluted with 3 column volumes of
elution buffer containing 250 mM imidazole, pH 8.0. Fractions containing
N- or C-terminally tagged CALHM2 were pooled and
concentrated directly for size-exclusion chromatography in TBS
supplemented with 0.1% digitonin, pH 8.0. The initial purification using
N-terminally or C-terminally GFP-tagged constructs for cryo-EM
yielded micrographs containing only hemichannels, and the
three-dimensional reconstructions indicated fuzzy tails hanging outside of
the detergent micelle, resulting in low-resolution reconstructions. To
remove the flexible GFP tag, thrombin digestion was carried out
overnight at 4 °C at a ratio of 1:20 thrombin:protein. The resulting peak
fractions containing GFP-cleaved CALHM2C (the C-terminally GFP-tagged
construct) were then combined and concentrated to 8-9 mg/ml or tested
directly via negative-stain grids. GFP-cleaved CALHM2C in
the presence of EDTA or RUR was frozen for further cryo-EM studies.
A high concentration of RUR destabilizes CALHM2, based on a stability
test using fluorescence-detection size-exclusion chromatography
(FSEC) (Extended Data Fig. 1j). Therefore, we chose a RUR concentration
of 1.5 mM.


Cryo-EM sample preparation and data acquisition 

Concentrated CALHM2C protein was preincubated with EDTA or RUR
for 30 min on ice. The protein mixture (2.5 µl) was then applied to
glow-discharged Quantifoil carbon grids (gold, 1.2/1.3-µm size/hole space,
300 mesh), blotted for 2 s at 100% humidity using a Vitrobot Mark III,
and flash-frozen in liquid ethane. Particle images were collected
using an FEI Titan Krios electron microscope equipped with
a Gatan K2 Summit direct electron detector at a nominal magnification
of 130,000×, recording image stacks in super-resolution counting mode
at a binned pixel size of 1.026 Å. Each image was dose-fractionated into
40 frames using a total exposure time of 8 s at 0.2 s per frame. The dose
rate was 6.76 e⁻ Å⁻² s⁻¹. All image stacks were collected using SerialEM,
an automated acquisition program. Nominal defocus values varied
from 1.0 to 2.5 µm.
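
The acquisition parameters above determine the number of frames and the accumulated electron dose. The short sketch below works through that arithmetic; it assumes the quoted dose rate is expressed per Å² per second (if it were per pixel per second, the totals would differ), and the variable names are illustrative only.

```python
# Acquisition parameters quoted in the text.
EXPOSURE_S = 8.0      # total exposure time per image stack (s)
FRAME_S = 0.2         # exposure time per frame (s)
DOSE_RATE = 6.76      # electrons per Å^2 per second (assumed unit)
PIXEL_SIZE_A = 1.026  # binned pixel size (Å)

# Number of dose-fractionated frames per stack.
n_frames = round(EXPOSURE_S / FRAME_S)

# Accumulated dose over the whole exposure and per frame (e/Å^2).
total_dose = DOSE_RATE * EXPOSURE_S
dose_per_frame = total_dose / n_frames

# Accumulated dose per binned pixel (electrons per pixel).
dose_per_pixel = total_dose * PIXEL_SIZE_A ** 2

print(f"frames per stack: {n_frames}")
print(f"total dose: {total_dose:.2f} e/Å^2")
print(f"dose per frame: {dose_per_frame:.3f} e/Å^2")
print(f"dose per binned pixel: {dose_per_pixel:.1f} e")
```

The frame count computed this way matches the 40 frames stated in the text, which is a useful consistency check on the exposure settings.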


Cryo-EM data processing 

Movies were motion-corrected using MotionCor2. Gctf was applied
to non-dose-weighted micrographs to estimate defocus values. Particles
were picked using Gautomatch
(http://www.mrc-lmb.cam.ac.uk/kzhang/Gautomatch/). Templates were
generated from the initial pilot results and subjected to two rounds of
two-dimensional classification using RELION 2.1 and RELION 3.0-beta-2.
In the EDTA dataset, CryoSPARC was used to separate hemichannels and
gap junctions from two-dimensional classification results. In addition,
CryoSPARC was used to obtain the initial models of the hemichannel and gap
junction. For the EDTA-CALHM2 data, the hemichannel particles and
gap-junction particles were further cleaned up by two-dimensional
classification, and the selected particles
were subjected to three-dimensional classification in RELION
using maps from CryoSPARC low-pass-filtered to 60 Å as reference
models. For the RUR-CALHM2 data, the EDTA-CALHM2 hemichannel
map was used as a reference model during three-dimensional classification.
Particles from classes that showed high-resolution features were
refined without applying symmetry, and particles from classes showing
obvious C11 symmetry were further refined using C11 symmetry. In the
RUR-CALHM2 dataset, in addition to the symmetric class that yielded
the highest-resolution structure, two well-defined non-symmetric
classes were observed. Both non-symmetric classes are ellipse-shaped
when viewed perpendicular to the membrane, having the S1 helices
in the lifted conformation. In one of these two non-symmetric classes,
prominent densities belonging to two subunits extending to the centre
of the pore were observed. Notably, these two subunits are located on
the opposite sides of the longer axis of the ellipse; it is possible that
the ellipse shape is a result of the push between the S1 helix and NTH
of these two subunits. To assess the structural heterogeneity of the
first transmembrane helix S1, we analysed the particles that yielded
the highest-resolution maps of RUR-CALHM2 and EDTA-CALHM2
using an approach that combined symmetry expansion and signal
subtraction, in which all of the subunits were subtracted and classified
without image alignment in RELION. For RUR-CALHM2, two different
conformations of helix S1 appeared, with 70% in the lifted conformation
and 13% in the vertical conformation attached to S3, with the rest not
being well-defined. No RUR densities are observed in the class with the
vertical S1. For EDTA-CALHM2^hemi, the three-dimensional classification
of a single subunit reveals that approximately 48% of the S1 helices are
in the vertical conformation and attached to S3, whereas the rest are not
well-defined. We suggest that this may be caused by the high flexibility
of helix S1 in the absence of RUR, and that the non-resolved densities
may represent the intermediate states of S1 between the lifted and vertical
conformation. By contrast, three-dimensional classification of single
subunits of EDTA-CALHM2^gap did not yield meaningful results. The
proportion of the single subunits in a gap junction structure is probably
too small to perform reliable signal subtraction and subsequent
three-dimensional classification.
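
The bookkeeping behind symmetry expansion for an 11-fold symmetric particle can be sketched in a few lines. This is a conceptual illustration only and not how RELION implements the procedure: it assumes the symmetry axis is the z (pore) axis, and the function names and particle identifiers are hypothetical.

```python
import math

N_FOLD = 11  # C11 symmetry of the undecamer


def c11_rotations(n=N_FOLD):
    """Return n rotation matrices (3x3 tuples) about the z axis."""
    mats = []
    for k in range(n):
        a = 2.0 * math.pi * k / n
        c, s = math.cos(a), math.sin(a)
        mats.append(((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0)))
    return mats


def apply(mat, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(mat[i][j] * v[j] for j in range(3)) for i in range(3))


def expand(particle_ids, n=N_FOLD):
    """Symmetry expansion: one (particle, subunit index) record per subunit."""
    return [(pid, k) for pid in particle_ids for k in range(n)]


rotations = c11_rotations()
step_deg = 360.0 / N_FOLD                # angle between neighbouring subunits
expanded = expand(["p0", "p1", "p2"])    # three hypothetical particles

print(f"rotation step: {step_deg:.2f} degrees")
print(f"subunit records after expansion: {len(expanded)}")
```

Each particle contributes 11 subunit records, one per symmetry-related orientation, so that after signal subtraction every subunit can be classified on its own. This is why per-subunit occupancies such as the fractions of lifted and vertical S1 helices can be reported.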


Model building 

De novo model building of RUR-CALHM2 was carried out using Coot,
guided by bulky residues and secondary structure prediction. The
architecture of CALHM2 is mainly α-helical, which greatly assisted in
register assignment. The EDTA-CALHM2 models were built based
on the RUR-CALHM2 model. Model building of the less-well-defined
regions (including the NTH, S1 helix and part of the extracellular
loops) was carried out using the maps refined without a soft solvent
mask, and the maps of the single subunit obtained through symmetry
expansion, signal subtraction and three-dimensional classification.
The models were then subjected to real-space refinement using
phenix.refine with secondary structure restraints. The refined models
were manually examined and adjusted in Coot. Although we can fit
side chains in most parts of the protein, owing to intrinsic positional
uncertainty we fitted only the protein backbone into the densities of
the NTH, S1 helix and part of the extracellular loops based on secondary
structure prediction, and our electrophysiology experiment on a
mutant (CALHM2(E37R)) located in S1. The exact position of residues
in these regions (residues 13-40 and 138-152) should be interpreted
with caution. For validation of the refined structures, Fourier shell
correlation curves were applied to calculate the difference between
the final model and electron microscopy map. The geometries of the
atomic models were evaluated using MolProbity. Figures were created
using PyMOL and UCSF Chimera.


Thermostability experiment 

Wild-type CALHM2 and mutants were transfected into tsA201 cells as
described in the ‘Construct screening’ section. After collection, wild-type
CALHM2 and mutants were extracted in TBS buffer supplemented
with 10 mM n-dodecyl β-D-maltoside and 2 mM cholesteryl hemisuccinate
Tris salt at 4 °C for 1 h. Solubilized samples were ultracentrifuged
at 87,000g for 20 min at 4 °C to remove cell debris and membranes. The
resulting supernatants were heated at selected temperatures for 10 min
and centrifuged at 87,000g at 4 °C for 20 min to remove any aggregates.
Heated supernatants were loaded onto a 3-ml GE Healthcare Superose
5/150 GL column for high-performance liquid chromatography to
measure GFP intensity (excitation 488 nm and emission 510 nm) and
compared to 4 °C controls.
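The comparison with the 4 °C control can be expressed as a simple ratio of GFP peak heights. The sketch below is a minimal illustration under assumed inputs: the peak heights and temperatures are hypothetical placeholder values, not measurements from this study, and the function name is ours.

```python
def fraction_remaining(peak_heated, peak_control):
    """Return the heated-sample FSEC peak height as a fraction of the 4 °C control."""
    if peak_control <= 0:
        raise ValueError("control peak height must be positive")
    return peak_heated / peak_control


# Hypothetical GFP peak heights (arbitrary fluorescence units).
control_peak = 1000.0                             # 4 °C control
heated_peaks = {40: 950.0, 50: 700.0, 60: 250.0}  # keyed by temperature in °C

curve = {t: fraction_remaining(p, control_peak)
         for t, p in sorted(heated_peaks.items())}
for temperature, fraction in curve.items():
    print(f"{temperature} °C: {fraction:.2f} of control")
```

Plotting such fractions against temperature for the wild type and each mutant gives a melting-style curve from which relative stabilities can be compared.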


Electrophysiology 

Flasks of tsA201 suspension cells were grown to a density of 1.0 × 10^6
cells/ml at 37 °C and infected with 1-5% (v/v) human CALHM2C P2 virus.
After 12 h, 5 mM sodium butyrate was added, and the cells were left
to incubate at 30 °C for an additional 24 h. Infected cells were plated
to a final density of 0.3 × 10^6 cells/ml in a 24-well plate on microscope
cover glass (Fisher), incubated for 3 h and recorded 24-48 h after
infection. Whole-cell patch-clamp recordings were collected at room
temperature using a HEKA EPC-10 amplifier. Cells were held at −60 mV
and data were recorded at 20 kHz and filtered at 1 kHz. Inhibitor- and
activator-containing buffers were applied using a two-channel theta-glass
pipette. The bath solution was prepared using 140 mM NaCl, 5.4
mM KCl, 5 mM CaCl2, 1 mM MgCl2, 10 mM HEPES and 20 mM sucrose
(pH 7.4), and electrodes were filled with an internal solution of 130 mM
KCl, 10 mM NaCl, 1 mM CaCl2, 10 mM HEPES, and 11 mM EGTA (pH 7.3).
All data were collected using Patchmaster software (HEKA).
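The recording parameters above, together with the voltage-step protocol and the normalization described in the Extended Data Fig. 1 legend (500-ms pulses from 140 mV to −140 mV in 20-mV steps; currents normalized to the EGTA current at 140 mV and reported as mean ± s.e.m.), can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the study; the function names are hypothetical.

```python
import math

SAMPLING_HZ = 20_000  # recording rate stated in the Methods
PULSE_MS = 500        # pulse length from the Extended Data Fig. 1 legend


def voltage_steps(start_mv=140, stop_mv=-140, step_mv=20):
    """List the command voltages of the step protocol, from start to stop."""
    n_steps = (start_mv - stop_mv) // step_mv + 1
    return [start_mv - i * step_mv for i in range(n_steps)]


def samples_per_pulse(pulse_ms=PULSE_MS, rate_hz=SAMPLING_HZ):
    """Number of digitized samples acquired during one voltage pulse."""
    return pulse_ms * rate_hz // 1000


def normalize(currents, reference):
    """Divide each current by a reference amplitude (e.g. EGTA current at 140 mV)."""
    if reference == 0:
        raise ValueError("reference amplitude must be non-zero")
    return [c / reference for c in currents]


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values for s.e.m.")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


print(voltage_steps())
print(samples_per_pulse(), "samples per pulse")
```

Normalizing each cell to its own reference current before averaging removes cell-to-cell differences in expression level, so the s.e.m. reflects variation in the shape of the current-voltage relationship rather than in absolute amplitude.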


Deglycosylation test 

In brief, 500 μl of pelleted cells was thawed on ice, mixed with 100
μl of 1% digitonin detergent and left to incubate for 45 min at 4 °C.
Detergent-cell mixtures were then centrifuged at 21,000g for 10 min
at 4 °C. Supernatant was transferred into fresh Eppendorf tubes and
centrifuged at 87,000g for 20 min at 4 °C. High-speed supernatant was
then transferred into fresh Eppendorf tubes for a second time. Each
sample was divided into four tubes; the first two tubes as the control
and the other two tubes mixed with 0.5 μl (250 U) of endoglycosidase
H (NEB P0702S) or PNGase F (NEB P0704S), respectively. Samples were
left nutating overnight at 4 °C at optimum pH. The following day, all
samples were centrifuged at 186,000g for 20 min at 4 °C, mixed with 5x
sample loading buffer to a final concentration of 1x and run on a 4-20%
SDS-PAGE gel with protein standard. Unstained gel was analysed using
a fluorescent gel imager to detect GFP.
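The dilution of 5x loading buffer to a final 1x concentration follows from a simple mass balance. The sketch below shows the arithmetic; the sample volume is a hypothetical example and the function name is ours.

```python
def loading_buffer_volume(sample_ul, stock_x=5.0, final_x=1.0):
    """Volume of stock loading buffer to add so the mixture ends at final_x.

    Solves stock_x * v = final_x * (sample_ul + v) for v.
    """
    if stock_x <= final_x:
        raise ValueError("stock must be more concentrated than the final mix")
    return sample_ul * final_x / (stock_x - final_x)


# Hypothetical 20 μl sample: add 5 μl of 5x buffer for a 25 μl, 1x mixture.
print(loading_buffer_volume(20.0), "μl of 5x buffer")
```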


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 
The cryo-EM density maps and coordinates of EDTA-CALHM2hemi,
EDTA-CALHM2gap and RUR-CALHM2 have been deposited in the Electron
Microscopy Data Bank (EMDB) under accession numbers EMDB-20788,
EMDB-20790 and EMDB-20789, respectively, and in the Research
Collaboratory for Structural Bioinformatics Protein Data Bank
under accession codes 6UIV, 6UIX and 6UIW, respectively. The single
subunit map(s) obtained from signal subtraction and associated
mask have been deposited under the corresponding EMDB accession
number.
Extended Data Fig. 1 | Electrophysiology experiments, construct screening
and purification. a, b, Representative current traces recorded in whole-cell
mode at −60 mV for cells expressing wild-type CALHM2, tsA control cells, tsA
cells transfected with empty pEGC vector (a) or tsA cells expressing
CALHM2(E37R) (b). Cells were switched from bath buffer that contained 5 mM
Ca2+ to one that contained 0.5 mM EGTA (0 mM Ca2+) to induce the current. The
current was inhibited using a buffer that contained 100 μM RUR and 0.5 mM
EGTA. n = 13, 15, 10 or 6 biologically independent experiments were performed
for wild-type CALHM2, CALHM2(E37R), tsA control or pEGC control,
respectively. c, Quantification of current amplitude in 0.5 mM EGTA, 100 μM
RUR and 0.5 mM EGTA, and 5 mM Ca2+ conditions for CALHM2-expressing and
control cells from a, b. Two-tailed paired and unpaired t-tests were applied to
calculate P values (using GraphPad Prism 7) for comparisons within and
between cell types, respectively. Each dot indicates the value of one single
independent experiment. RUR inhibition was nearly abolished in the CALHM2(E37R)
mutant. d, Representative current-voltage relationships obtained by applying
500-ms voltage pulses ranging from 140 mV to −140 mV from a holding
potential of 0 mV (20-mV steps) to cells expressing wild-type CALHM2 (top),
tsA control cells (middle) and tsA cells transfected with empty pEGC vector
(bottom). Currents were recorded in the presence of 5 mM Ca2+, 0.5 mM EGTA
(0 mM Ca2+) or 100 μM RUR and 0.5 mM EGTA. n = 7, 6 or 4 biologically
independent experiments were performed for wild-type CALHM2, tsA control
or pEGC control, respectively. e, The data of wild-type CALHM2 in d were
normalized to the amplitude of the current recorded in the presence of EGTA at
140 mV and calculated as mean ± s.e.m. from 7 cells. f, Averaged current
amplitude of cells expressing wild-type CALHM2 (n = 7), cells transfected with
empty pEGC vector (n = 4) and tsA control cells (n = 6) in the presence of 0.5 mM
EGTA, plotted as mean ± s.e.m. g, FSEC profiles showing that human CALHM2
N-terminally tagged with GFP (CALHM2N) and human CALHM2 C-terminally
tagged with GFP (CALHM2C) have the best biochemical properties among
human CALHM1, CALHM2 and CALHM3 when solubilized in digitonin. This
experiment was repeated three times yielding similar results. h, Size-exclusion
chromatography profiles of CALHM2C in digitonin before (blue) and after (red)
GFP cleavage. After cleavage, the main peak shifted towards a smaller
molecular weight. Purification of CALHM2C was repeated multiple times (>10),
yielding similar results in each case. i, SDS gel of purified CALHM2C (indicated
by red arrowheads) before (left band) and after (right band) GFP cleavage.
j, Stability test of purified CALHM2C in the presence of RUR at two
concentrations using FSEC, showing that a high concentration of RUR
decreases CALHM2 stability. This experiment was repeated five times, each
yielding similar results.
Extended Data Fig. 2 | Factors that affect the formation of a gap junction.
a, A representative micrograph of purified CALHM2N by negative-stain
electron microscopy; selected two-dimensional classes are shown below.
Fuzzy areas (indicated by red arrowheads) are caused by flexible GFP tags; this
is also implied in c and d. Only hemichannels, and not gap junctions, appear in
the micrograph. b, A representative micrograph of purified CALHM2C (from
which the GFP tag has been cleaved) by negative-stain electron microscopy.
The fuzzy areas in a, c, d disappeared in the two-dimensional classes, which
confirms that they are indeed due to the GFP tag. One of the two-dimensional
class averages represents a gap junction (red circle). c, A representative
micrograph of purified CALHM2N by cryo-EM in the presence of 1 mM EDTA,
collected using the Titan Krios; selected two-dimensional classes are shown
below. Only hemichannels, and not gap junctions, are observed. d, A representative
micrograph of purified CALHM2C by cryo-EM using the Talos Arctica;
selected two-dimensional classes are shown below. Only hemichannels, and
not gap junctions, are seen. e, FSEC profile of wild-type CALHM2 and the
CALHM2(N168A) mutant, expressed in tsA201 and N2a cells. This experiment
was repeated three times, each yielding similar results. f, The wild-type
CALHM2 ran at the same height in SDS gel before and after treatment using
250 U of Endo H and 250 U PNGase at optimum pH. Moreover, the
CALHM2(N168A) mutant ran at the same height as the wild-type CALHM2.
These data suggest CALHM2 is not glycosylated. This experiment was repeated
multiple times (Endo H, n = 6; PNGase, n = 3), all yielding non-glycosylation
phenotypes.
Extended Data Fig. 3 | The workflow of cryo-EM data processing for
EDTA-CALHM2. A total of 4,809 movies were collected using a Titan Krios equipped
with K2. Particles were autopicked using Gautomatch, and visually examined in
RELION to eradicate false-positive selections. After manual clean-up, particles
were subjected to two rounds of two-dimensional classification in RELION. Ab
initio reconstruction with two classes was performed in CryoSPARC to
separate hemichannel and gap-junction particles, and to generate initial
models of the hemichannel and gap junction. Hemichannel and gap-junction
particles were further cleaned up using two-dimensional and three-dimensional
classification in RELION. Particles from three-dimensional classes
that showed high-resolution features and obvious C11 symmetry were
combined and refined in RELION. To assess the structural heterogeneity of the
helix S1, an approach that combined symmetry expansion and signal
subtraction was carried out, in which all of the subunits were subtracted and
classified without image alignment in RELION.
Extended Data Fig. 4 | The workflow of cryo-EM data processing for
RUR-CALHM2. A total of 5,025 movies were collected using a Titan Krios equipped
with K2. Particles were autopicked using Gautomatch, and visually examined in
RELION to eradicate false-positive selections. After manual clean-up, particles
were subjected to two rounds of two-dimensional classification in RELION.
Three-dimensional classification using the map of EDTA-CALHM2hemi as an
initial model yielded three classes with high-resolution features. Particles from
the three classes were refined without applying symmetry, and particles from
the class that showed obvious C11 symmetry were further refined using C11
symmetry. The two non-symmetric classes are highlighted by the red ellipses.
The densities of the S1 helix and NTH that extend to the pore centre in one of the
non-symmetric classes are labelled. To assess the structural heterogeneity of
helix S1, an approach that combined symmetry expansion and signal
subtraction was carried out, in which all of the subunits were subtracted and
classified without image alignment in RELION.
Extended Data Fig. 5 | Cryo-EM analysis of human CALHM2. a, e, Representative
electron micrograph of RUR-CALHM2 (a, out of 5,025 micrographs) and
EDTA-CALHM2 (e, out of 4,809 micrographs). b, f, i, Selected two-dimensional
class averages of the electron micrographs of RUR-CALHM2 (b),
EDTA-CALHM2hemi (f) and EDTA-CALHM2gap (i). c, g, j, The gold-standard Fourier shell
correlation (FSC) curves for the electron microscopy maps of RUR-CALHM2
(c), EDTA-CALHM2hemi (g) and EDTA-CALHM2gap (j) are shown in black, and the
Fourier shell correlation curves between the atomic model and the final
electron microscopy map are shown in red. d, h, k, The angular distribution of
particles used for the refinement of RUR-CALHM2 (d), EDTA-CALHM2hemi (h)
and EDTA-CALHM2gap (k).
Extended Data Fig. 6 | Representative densities of the reconstructions of
RUR-CALHM2, EDTA-CALHM2hemi and EDTA-CALHM2gap. a, c, e, Local-resolution
estimation of the structure of RUR-CALHM2 (a), EDTA-CALHM2hemi
(c) and EDTA-CALHM2gap (e), calculated using Bsoft. b, d, f, Representative
densities of RUR-CALHM2 (b), EDTA-CALHM2hemi (d) and EDTA-CALHM2gap (f).
The putative RUR-binding site density is shown in the panel on the right in b.
Extended Data Fig. 7 | Comparison of EDTA-CALHM2hemi and
EDTA-CALHM2gap with the connexin-46 gap junction, innexin-6 gap junction and a
VRAC. a, b, Overall structure comparison viewed parallel to the membrane
(a, cartoon representation) and viewed from the intracellular side (b, surface
representation), showing notable differences in symmetry, size and shape. The
VRAC in b is viewed from the extracellular side. The size of the VRAC in b
represents the largest diameter of the transmembrane domain. c, d, Single-subunit
comparison in two different views viewed parallel to the membrane.
The intracellular domains are highlighted using grey ellipses, and the grey
rectangles represent cell membranes. The buried surface area in each pair of
protomers in the CALHM2 gap junction is 378 Å² (4,161 Å² for an undecamer),
which is substantially smaller than the equivalent buried surface area in
connexin (94 Å²; or 5,654 Å² for a hexamer) and innexin (1,550 Å²; or 12,397 Å²
for an octamer).
Extended Data Fig. 8 | Secondary structure arrangement and domain
organization of human CALHM2, and sequence alignment of the human
CALHM family. a, The secondary structure prediction of human CALHM2, and
sequence alignment of CALHM family members CALHM1, CALHM2, CALHM3,
CALHM4, CALHM5 and CALHM6. Secondary structure prediction was
performed using the JPred online server. Sequences were aligned using the
Clustal Omega program and coloured using BLOSUM62 by conservation.
Residues involved in RUR binding are marked with black filled circles. The
extracellular helix EH0 in the S3-S4 linker is formed upon the docking of two
hemichannels. P86 in human CALHM1 is boxed in red. b, Domain organization
of the human CALHM2 protomer.
Extended Data Fig. 9 | The role of NTH in inhibition by RUR. a, Representative
current traces recorded in whole-cell mode at −60 mV in cells expressing
CALHM2(R10A) or CALHM2(ΔN20). Human CALHM2 has three positively
charged and two negatively charged residues in the NTH, resulting in one net
positive charge. Out of these charged residues, R10 is the only residue that is
conserved across CALHM1, CALHM2 and CALHM3. Cells were switched from
bath buffer that contained 5 mM Ca2+ to one that contained 0.5 mM EGTA (0 mM
Ca2+) to induce current. Current was inhibited using a buffer that contained
100 μM RUR and 0.5 mM EGTA. n = 14 or 10 biologically independent
experiments were performed for CALHM2(R10A) or CALHM2(ΔN20),
respectively. b, Quantification of current amplitude in 0.5 mM EGTA, 100 μM
RUR and 0.5 mM EGTA, and 5 mM Ca2+ conditions for cells expressing
CALHM2(R10A) or CALHM2(ΔN20), from a. Two-tailed paired t-tests were
applied to calculate P values for comparisons using GraphPad Prism 7. Data are
mean ± s.e.m. Each dot indicates the value of one single independent
experiment. c, Current-voltage relationships were obtained by applying 500-ms
voltage pulses that ranged from 140 to −140 mV from a holding potential of
0 mV (20-mV steps) to cells that express CALHM2(R10A) or CALHM2(ΔN20).
Currents were recorded in the presence of 5 mM Ca2+, 0.5 mM EGTA and 100 μM
RUR. Data were normalized to the amplitude of the current recorded in the
presence of EGTA at 140 mV, and calculated as mean ± s.e.m. n = 7 biologically
independent experiments were performed each for CALHM2(R10A) and
CALHM2(ΔN20).
Extended Data Fig. 10 | The CALHM2 gap junction. a, Surface representation
of a gap junction viewed parallel to the membrane. Two paired subunits are
highlighted. b, Surface representation of a hemichannel in the gap junction,
viewed from extracellular side. c, Interface remodelling when docking two
hemichannels (shown in grey; top) into a gap junction (shown in colour;
bottom). d, Cartoon representation of the interface between two
hemichannels. The two disulfide bonds are shown. The grey segment of S3-S4
linker represents deleted residues in a mutant (CALHM2(Δ143-146)). Only
parts involved in the docking of hemichannels and disulfide bonds are
highlighted in colour. e, Selected two-dimensional class averages of
CALHM2(Δ143-146). This mutant yielded only hemichannels, and not gap
junctions. f, Superimposition of single subunits of EDTA-CALHM2hemi (grey)
and EDTA-CALHM2gap (pink) using the CTD. Only parts with conformational
changes are highlighted in colour. To understand the conformational changes
upon docking, we compared single subunits of a hemichannel and a gap
junction. In the hemichannel, the loop connecting segment S3c and EH1 in the
S3-S4 linker is flat and lacks extensive contact with the rest of the protein,
giving rise to a flexible area that is probably required for the initiation of
docking. Indeed, the S3-S4 linker is defined better in RUR-CALHM2 than in
EDTA-CALHM2hemi, by forming interactions with the adjacent subunit. We
suggest that the restricted S3-S4 linker in the RUR-CALHM2 hinders the
docking of the hemichannels. g, Enlargement of the box in f, showing the
remodelling of the S3-S4 linker from EDTA-CALHM2hemi (grey) to
EDTA-CALHM2gap (pink). Upon docking, the S3c-EH1 loop remodels into two short
loops and a short α-helix (EH0); the EH0-EH1 loop forms the primary interface
and EH0 forms the minor interface in d. This motion accompanies an elevation
of the S3-S4 linker and segment S3c that leads to an outward flexing of S3a,
which breaks the loose interface between S3a and helix S1 in f. As a consequence,
the S1 helix is detached from S3 and moves into a lifted conformation. The
conformational changes of TMD upon docking in EDTA-CALHM2 are notably
consistent with those induced by RUR. Moreover, the two docked hemichannels
in the gap junction have similar conformations of their S1 helices.
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Statistics 


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection: SerialEM 3.7, Patchmaster 2x90.5 


Data analysis: Gctf-1.06, Gautomatch-0.56, Relion-3.0, CryoSparc-v2, coot-0.8.9.2, pymol-2.3.2, Motioncor2-1.1.0, phenix.real_space_refine_dev_3500, 
phenix.molprobity_dev_3500, UCSF chimera_1.13.1, UCSF chimera_0.91, GraphPad Prism 7 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The cryo-EM density map and coordinates of EDTA-CALHM2hemi, EDTA-CALHM2gap, and RUR-CALHM2 have been deposited in the Electron Microscopy Data Bank 
(EMDB) under accession numbers EMDB-20788, EMDB-20790 and EMDB-20789 and in the Research Collaboratory for Structural Bioinformatics Protein Data Bank 
under accession codes 6UIV, 6UIX and 6UIW. The single subunit map(s) obtained from signal subtraction and associated mask have been deposited under the 
corresponding EMDB accession number. 
Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


[x] Life sciences   [ ] Behavioural & social sciences   [ ] Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size: All the electrophysiology experiments were repeated at least four times using different cells. The sample size was determined based on the 
consistency of the recordings. 


Data exclusions: No data were excluded from the analysis. 


Replication: Each group of experiments was performed with several batches of cells, different infections and multiple independent researchers, to 
ensure reproducibility within the lab. 


Randomization: For electrophysiology experiments, cells with GFP fluorescence (proteins were GFP-tagged) were randomly selected. 


Blinding: The investigators were blinded to group allocation during data collection and analysis. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems (n/a | Involved in the study) 
- Antibodies 
- Eukaryotic cell lines 
- Palaeontology 
- Human research participants 
- Clinical data 

Methods (n/a | Involved in the study) 
- ChIP-seq 
- Flow cytometry 
- MRI-based neuroimaging 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s): Sf9 cells and tsA201 cells were purchased from ATCC. 
Authentication: Sf9 cells and tsA201 cells were authenticated. 
Mycoplasma contamination: Sf9 cells and tsA201 cells tested negative for Mycoplasma contamination. 


Commonly misidentified lines (see ICLAC register): No commonly misidentified lines were used. 
Corrections & amendments 


Author Correction: Maternal vitamin C regulates reprogramming of DNA
methylation and germline development 


https://doi.org/10.1038/s41586-019-1699-9 


Correction to: Nature https://doi.org/10.1038/s41586-019-1536-1 


Published online 04 September 2019 


Stephanie P. DiTroia, Michelle Percharde, Marie-Justine Guerquin, 
Estelle Wall, Evelyne Collignon, Kevin T. Ebata, Kathryn Mesh, 
Swetha Mahesula, Michalis Agathocleous, Diana J. Laird, 
Gabriel Livera & Miguel Ramalho-Santos 


In this Letter, draft versions of the five Supplementary Data files were 
inadvertently uploaded. These Supplementary Data files have all now 
been replaced online. 


Author Correction: Targeting cardiac fibrosis with engineered T cells 


https://doi.org/10.1038/s41586-019-1761-7 
Correction to: Nature https://doi.org/10.1038/s41586-019-1546-z 


Published online 11 September 2019 


Haig Aghajanian, Toru Kimura, Joel G. Rurik, Aidan S. Hancock, 
Michael S. Leibowitz, Li Li, John Scholler, James Monslow, Albert Lo, 
Wei Han, Tao Wang, Kenneth Bedi, Michael P. Morley, 
Ricardo A. Linares Saldana, Nikhita A. Bolar, Kendra McDaid, 
Charles-Antoine Assenmacher, Cheryl L. Smith, Dagmar Wirth, 
Carl H. June, Kenneth B. Margulies, Rajan Jain, Ellen Puré, 
Steven M. Albelda & Jonathan A. Epstein 


In this Letter, the bottom subpanel in the top left panel of Extended Data 
Fig. 8d, which shows CD3 staining of left ventricle tissue after AngII/PE + 
FAP CART treatment, was inadvertently duplicated from the top image 
of AngII/PE staining. The figure has been revised and original images 
re-quantified. Figure 1 of this Amendment shows the incorrect and the 
corrected top left panel of Extended Data Fig. 8d, for transparency to 
readers. Extended Data Fig. 8 and its Source Data have been corrected 
online. (The Supplementary Information to this Amendment contains 
the incorrect Source Data, for transparency to readers.) 


Supplementary Information is available in the online version of this 
Amendment. 
Fig. 1 | This figure shows the corrected subpanel of the top left panel of Extended Data Fig. 8d, including the incorrect, published subpanel, for comparison. 


Publisher Correction: Harnessing innate immunity in cancer therapy 


https://doi.org/10.1038/s41586-019-1758-2 


Correction to: Nature https://doi.org/10.1038/s41586-019-1593-5 


Published online 2 October 2019 


Olivier Demaria, Stéphanie Cornen, Marc Daéron, Yannis Morel, 
Ruslan Medzhitov & Eric Vivier 


In this Review, owing to an error during the production process, an 
incorrect version of Supplementary Table 1 was published. This has 
now been updated, and for transparency to readers the original
Supplementary Table 1 is provided as Supplementary Information to this 
Amendment. Additionally, there were errors in the citations in the 
paragraph beginning ‘The interplay between innate and adaptive 
immunity...’. These have been amended and references 11-13 have been 
renumbered in the reference list accordingly. The original reference 11 
(Scheper et al., Low and variable tumor reactivity of the intratumoral 
TCR repertoire in human cancers. Nat. Med. 25, 89-94 (2019)) is no 
longer cited in the Review, and reference 13 (Crozat et al., 2010) has 
been added and is cited in the above paragraph. The original Review 
has been corrected online. 


Supplementary Information is available in the online version of this 
Amendment. 
Work 


Send your careers story to: naturecareerseditor@nature.com 


Researchers moving to a new country can benefit from support from local colleagues. 


SAY HELLO TO YOUR NEW BENCHMATE 


How to welcome an international colleague. By Lara Pivodic 


In my work as a health-sciences researcher, 
I’ve moved to several countries, including
Belgium, the Netherlands and the 
United Kingdom. I’ve learnt that for every 
researcher who moves abroad, there are 
several more who will welcome that individual
and support their transition. In today’s 
global research environment, which values 
mobility and often expects it from researchers, 
these local teams have a responsibility to help 
colleagues from abroad have a good start. 
During my own international moves, I have 
experienced at first hand the difference that a 
supportive environment can make, alongside 
my own efforts to adapt to the new environments.
Here are seven hospitable things that 
research teams can do to help their colleagues. 


Welcome the new team member. First impressions
count on both sides and can set the tone 
for days and weeks to come. It is crucial to be 
aware that someone is joining the group, to 
know the date of their first workday and to say 
‘welcome’ in some way. 

On the first day of my secondment to the 
United Kingdom as a PhD student, I had a meeting
with my academic supervisors, one with an 
administrator for official intake and one with 
my ‘buddy’, a fellow PhD student who helped 
me to take my first steps in the new place and 
introduced me to the team. 


Value input. This step applies even if they are 
staying for only a short period. Invite them to 
all activities that other laboratory members 
are expected to attend, such as seminars, journal
clubs or team-building events. 

In the first month of my six-month 
visiting-researcher appointment in the United 
Kingdom, I was invited to present my work at 
a monthly seminar, select an article for the 
journal club and join the institute’s work- 
ing group on research dissemination. This 
motivated me and made me feel like a valued 
member of the team. 


Assign a good workspace. Give them a place 
next to or near many colleagues. This will 
greatly boost your new member’s integration 
into the local team, help them to quickly grasp 
how things work in the new lab and spur the 
exchange of information and knowledge. 


Show interest. While supervising an exchange 
PhD student in Belgium, I asked her how often 
she would like to meet with her collaborators. 
As a team, we then adjusted to her preference. 
Embrace the opportunity to get to know different
styles of, for instance, project management,
giving and receiving feedback and 
supervising PhD students. 

The new colleague might have a career path 
that is not typical for your field in your country.
Rather than wondering whether they are 
equally qualified, compare what you have both 
learnt in the course of your careers, and consider
how your experiences and skills could 
complement each other’s. 


Talk extensively about cultural differences. 
The value of international collaborations 
comes from the different perceptions, com- 
munication styles, work styles, customs and 
other forms of variety that a new colleague 
can bring to both the work and the social 
spheres. Understanding and speaking about 
differences will help to prevent conflicts that 
might arise. If they occur nonetheless, you will 
be able to more easily agree on how to handle 
similar situations in the future. 

On arriving in Belgium, I noticed that I was 
used to a more ‘direct’ communication style 
than were my local colleagues, as a result of 
my previous research post in the Netherlands. 
This prompted me to ask the locals for advice 
on how to approach discussions with my PhD
supervisors, interviews for grant applications
or negotiations with project partners. I am
convinced that such exchanges have greatly
helped me to establish and uphold successful
collaborations. 


Offer help with practical matters. Navi- 
gating a new health-care system, finding a 
good phone plan or arranging childcare are 
time-consuming at best and nerve-racking 
at worst. I was forever grateful to a co-worker 
in Belgium who helped me to arrange health 
insurance and an annual public-transport pass 
and pointed out the place that sells the best 
bread in town. 

Ideally, your university’s international 
office should provide a welcome pack that 
explains which administrative and practical 
matters need to be arranged and how. Make 
sure that the new researcher gets one before 
their arrival. 


Pay attention to the little things. Little things
count for someone whose life — particularly 
their social life — has been turned upside down. 
Suggesting a coffee together outside work or 
offering to show your new colleague a nice 
place in town might help them to forget the 
stresses of relocating for a little while. 
During my stay in the United Kingdom, 
I joined a group of fellow PhD students for 
weekly Friday breakfasts at a cafe close to 
our university. These mornings allowed us to 
bond, and I learnt a great deal about UK life 
outside the workplace. If you get along with 
the newcomer, you might make a new friend. 


Lara Pivodic is a postdoctoral fellow of the 
Research Foundation-Flanders in Belgium and 
a senior researcher at Vrije Universiteit Brussel, 
where she conducts research on palliative and 
end-of-life care. 

Twitter: @LaraPivodic 
DIVERSITY DEFICIT
DRAGS ON 


There are promising signs for gender and ethnic 
representation in US graduate programmes, but 
parity is still far off, says study. By Virginia Gewin 


The number of Indigenous and Latinx
students enrolling in US graduate-level
programmes for the first time rose
between autumn 2017 and 2018,
according to a report from the Council
of Graduate Schools (CGS) in Washington DC, 
which represents more than 500 universities, 
mainly in the United States. 

The report found that first-time enrolment 
in PhD and master’s programmes grew by 
8.3% and 6.8%, respectively, among Ameri- 
can Indian/Alaska Native and Latinx students 
(Latinx refers to US residents with origins in
Latin America). Among other science, technology,
engineering and mathematics (STEM)
fields, maths and computer science saw a 
40% and 14.2% increase in the proportions 
of American Indian/Alaska Native and Latinx 
students enrolling, respectively. The propor- 
tion of black and African American first-time 
enrollees in physical and Earth sciences rose 
by 12.5%. Results are based on responses from 
589 institutions in an annual survey. 

Yet, overall, US graduate-level programmes 
still have low proportions of students from 
minority ethnic groups. Black and African 
American students comprise 11.8% of total 
first-time enrollees, and Latinx students 
comprise 11.6%, the report found. By comparison,
according to the 2010 US census, black
and African American people represent 13.4% 
of the nation’s population, and Hispanic or 
Latinx individuals represent 18.3%. 

Indigenous students represent less than 1% 
of first-time enrollees, even though their total 
enrolment rose by 1.2% from a year ago. “More
work has to be done in the graduate-education 
community to increase the representation of 
these students in the science, technology, 
engineering and mathematics fields,” says 
report co-author Hironao Okahana, the CGS’s 
associate vice-president for research and 
policy analysis. “All fields of study need to be 
a welcoming place for people from a variety 
of different backgrounds.” 

First-time international enrolment in 
graduate-level programmes fell for the fifth 
consecutive year, and is now down to 20% of 
all enrolment. The study found that the decline 
was most marked in engineering programmes, 
in which first-time enrolment of international 
students fell by 8.3% from 2017 numbers. 

Female students continue to be out- 
numbered by their male counterparts in 
some graduate-level STEM programmes. 
They account, for example, for only 38.2% of 
physical and Earth-sciences graduate students 
and 32.1% of maths and computer-science students.
“While the rate of growth for women in
sciences looks good, there is still a long way to
go to catch up,” says Okahana. 


Virginia Gewin is a freelance writer in 
Portland, Oregon. 
A PICTURE IS WORTH A
THOUSAND BASE PAIRS


A small but powerful toolset makes sharing genomic data
visualizations straightforward. By Anna Nowogrodzki


When Adam Siepel was building
algorithms for evolutionary
genomics as part of his PhD, he
wasn’t thinking about visualization.
But, as a graduate student
in the laboratory of computational biologist
David Haussler, at the University of California,
Santa Cruz (UCSC), he happened to sit next to
the software engineers who were building and
maintaining a tool called the UCSC Genome
Browser. These engineers helped Siepel to make
his algorithms publicly available as a track, or
data overlay, that anyone could explore.

Genome browsers are graphical tools that
display the genome sequence, usually as a
horizontal line. Other sequence-associated
data are aligned and stacked above and below
that line in ‘tracks’, for instance to illustrate the
relationship between gene expression, DNA
modification and protein-binding sites.
Siepel’s track identifies sequences that have
been retained over evolutionary time; when a
user applies it while viewing the alignment of
genomic data from two or more species, the 
track highlights regions that are evolution- 
arily conserved. Allowing others to use the 
algorithm to highlight regions of interest in 
their own data was “probably the single most 
important thing I did during my PhD”, says 
Siepel, who is now a computational biologist 
at Cold Spring Harbor Laboratory in New York. 
Other researchers have used it, for instance, to 
find mutations associated with diseases and
to pinpoint functionally important regions 
of noncoding RNA molecules. 
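
To make the idea of a track concrete, the sketch below models one as a list of intervals on a chromosome and returns the features that overlap the window a user is viewing. The `Feature` class, the `features_in_window` helper and all coordinates are hypothetical illustrations, not the UCSC Genome Browser's internal data model; the zero-based, half-open coordinates follow the convention of the BED file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated region in a track (hypothetical structure)."""
    chrom: str
    start: int  # zero-based, inclusive
    end: int    # exclusive
    name: str


def features_in_window(track, chrom, win_start, win_end):
    """Return the features on `chrom` that overlap [win_start, win_end)."""
    return [
        f for f in track
        if f.chrom == chrom and f.start < win_end and f.end > win_start
    ]


# A made-up 'conservation' track with three regions.
conserved_track = [
    Feature("chr1", 100, 250, "conserved_1"),
    Feature("chr1", 900, 1200, "conserved_2"),
    Feature("chr2", 50, 75, "conserved_3"),
]

visible = features_in_window(conserved_track, "chr1", 200, 1000)
print([f.name for f in visible])
```

A browser would draw each returned feature as a block aligned beneath the matching stretch of the genome line; stacking several such lists gives the layered view the article describes.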

Today, a growing collection of free and 
open-source tools exists for sharing such 
genomic data. Which one is right for you 
depends on what kind of sharing you want to 
do: communicating with a collaborator, for 
instance, requires different software from 
what you'd use for disseminating data to the 
broader scientific community. 

Whatever the motivation, sharing genomic 
data broadens its impact, says Siepel. “Almost 
all of our most-cited papers are supported by 
browser tracks,” he says. 

For broad dissemination of genomic data, 
Siepel recommends the approach that worked
for him: making a track. And he suggests two 
genome browsers to display them: “UCSC and 
Ensembl are the leaders,” he says. 


Dissemination stations 


The UCSC Genome Browser and its ‘track hubs’ 
— data tracks that are hosted remotely by external
teams — can run in a web browser or on the
desktop. The desktop version is called Genome
Browser in a Box. Users can submit data to be
included as a public track registered with
UCSC. The team typically accepts only widely
useful data (and not, for instance, those limited
to a specific disease), and publication in a
peer-reviewed journal is a plus. Alternatively,
users can build personal track hubs, which
involves formatting their genomic data in files
of a specific format, indexing those files and
making them web-accessible. 
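
As a rough illustration of what a personal track hub involves, the Python sketch below writes the three small text files that describe a hub: `hub.txt`, `genomes.txt` and a `trackDb.txt`. The field names reflect the UCSC track hub format as commonly documented, but treat them as something to check against the official instructions; the hub name, genome assembly, e-mail address and data URL are placeholders, and the `write_hub` helper is hypothetical. The sketch does not convert or index the data file itself, which would still need to be prepared in an indexed format such as bigWig and placed at the stated URL.

```python
from pathlib import Path
import tempfile


def write_hub(root, hub_name, genome, track_name, big_data_url, email):
    """Write hub.txt, genomes.txt and <genome>/trackDb.txt under `root`."""
    root = Path(root)
    (root / genome).mkdir(parents=True, exist_ok=True)

    # Top-level description of the hub.
    (root / "hub.txt").write_text(
        f"hub {hub_name}\n"
        f"shortLabel {hub_name}\n"
        f"longLabel Example hub for {hub_name}\n"
        "genomesFile genomes.txt\n"
        f"email {email}\n"
    )
    # Which genome assemblies the hub has tracks for.
    (root / "genomes.txt").write_text(
        f"genome {genome}\n"
        f"trackDb {genome}/trackDb.txt\n"
    )
    # One track entry pointing at a web-accessible, indexed data file.
    (root / genome / "trackDb.txt").write_text(
        f"track {track_name}\n"
        f"bigDataUrl {big_data_url}\n"
        f"shortLabel {track_name}\n"
        f"longLabel Example signal track {track_name}\n"
        "type bigWig\n"
    )
    return root / "hub.txt"


# Placeholder values; a real hub would live on a public web server.
hub_dir = tempfile.mkdtemp()
hub_file = write_hub(
    hub_dir, "exampleHub", "hg38", "exampleSignal",
    "https://example.org/data/exampleSignal.bw", "someone@example.org",
)
print(hub_file.read_text())
```

Once the directory is served over the web, the address of `hub.txt` is what a user would give to the browser to load the hub.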

The Ensembl genome browser, hosted at 
the European Molecular Biology Laboratory's 
European Bioinformatics Institute in Hinxton, 
near Cambridge, UK, allows users to import 
data as custom tracks, just as UCSC does; 
both its own format and the UCSC formats 
are supported. The Ensembl team has built 
a searchable Track Hub Registry to make it 
easy to find relevant track hubs for use with 
Ensembl or the UCSC Genome Browser. 

The UCSC browser accumulates roughly 
two million hits a day, says Robert Kuhn, 
associate director of the UCSC project. And 
several large projects use it to disseminate 
data. The Genotype-tissue Expression Project, 
for instance, used the browser to create a 
track that visualizes as many as 53 tissues 
from 1,000 donors. The journal Nucleic Acids 
Research requires authors with whole-genome 
data to create a track hub for reviewers, Kuhn 
notes, and some authors choose to make them 
available to readers as well. 

If you have basic Unix command-line skills, 
setting up a UCSC track hub takes just a few 
hours, says Kuhn. Instructions are available 
at go.nature.com/2pqkym. 


Shareable, embeddable 


UCSC and Ensembl allow researchers to share 
datasets as tracks in a public, centralized database
that is controlled by others and available
to any user of that database. If your goal is to
embed a visualization in a website, or to create
a specialized visualization for a paper, other 
options are available; these include GIVE, 
JBrowse and IGV. (UCSC and Ensembl can also 
perform these tasks.) 

GIVE is an open-source tool that allows 
researchers to build custom genome browsers 
for their labs with little if any programming. 
According to Xiaoyi Cao, a GIVE developer 
and a software engineer at Google, there are 
three ways to host data. One is for research- 
ers to build an entire GIVE instance on their 
lab server using GIVE-Docker, a pre-packaged 
version of GIVE that the container engine
software Docker can run immediately. Because
the data can remain on a private server, they
do not have to be visible to the web and thus 
can be more secure. 

Alternatively, labs can submit a list of URLs 
that point to the data sets they want to include, 
and GIVE will build the database for them, no 
programming required. The data can be in 
any of multiple formats, including those for 
gene-expression and protein-binding data. 
The resulting database will be based at a GIVE 
instance, or mirror, hosted by the University 
of California, San Diego (UCSD). And accord- 
ing to Sheng Zhong, the UCSD computational 
biologist who heads the GIVE team, it takes just 
two to three minutes to set up. 

The third option is to include your data in 
the public GIVE data hub. Researchers submit 
their metadata to an online form, and the GIVE
developers will let them know if their data have
been selected. 

JBrowse can run either in a browser or on 
the desktop. Ian Holmes, JBrowse’s lead developer
and a computational biologist at the
University of California, Berkeley, says that 
he designed the tool to be responsive, intuitive 
and accessible for non-coders. With the desk- 
top version, users can load data directly from 
their computer, Holmes says. The browser 
version requires an index file that tells the 
browser where to find the relevant data; it also 
requires data to be web-visible (for example, 
in the cloud or on a lab server). The JBrowse
community has compiled a repository of 
about 50 plugins “that significantly enriches 
the visualization”, says Holmes. One example 
allows users to see all DNA methylation results 
in a single track.

Several specialized genome browsers and 
databases use JBrowse for data visualiza- 
tion; among these are the cancer genome 
browser COSMIC and VEuPathDB, a genomic 
database for pathogens and disease vectors. 
David Beare, a computational biologist at the 
Wellcome Sanger Institute in Hinxton, says that 
COSMIC uses JBrowse in part because “it was 
faster and more responsive, and certainly more 
intuitive” than other genome-browser options 
available. The VEuPathDB database develop- 
ers found that JBrowse “was most amenable 
to our own active development” of plugins, 
says Omar Harb, a microbiologist at the 
University of Pennsylvania in Philadelphia, 
and director of scientific outreach and 
education for VEuPathDB. 

JBrowse has also been used to build a 
collaborative annotation tool called Web
Apollo, which allows multiple researchers to
simultaneously annotate the same data in real
time, as in Google Docs. 


Get a Broad view 


IGV is a genome browser maintained by UCSD
and the Broad Institute of MIT and Harvard
in Cambridge, Massachusetts. Available in
desktop, browser-based and embeddable
JavaScript versions, IGV can generate QR codes
(square barcodes) for specific data visualizations;
for example, for inclusion on a poster.
“IGV is always run locally on a user’s computer.
There is no notion of ‘uploading’ data or saving 
sessions to a central IGV server hosted by us,” 
says Helga Thorvaldsdottir, a software engi- 
neer at the Broad Institute. That also makes 
the system compatible with restricted data. 

Jim Robinson, IGV’s lead developer at UCSD, 
says that the browser is fast and easy to use. 
“Most users can learn the basics in a half hour
or less,” Robinson says. And the tool has racked
up more than 7,000 citations, Thorvaldsdottir
says. At the Memorial Sloan Kettering Cancer 
Center in New York City, researchers have used 
IGV to visually check the genomic variants of 
patients whose cancer they sequence, says 
Robinson. 

Prospective users of these tools can find 
plentiful educational resources online, 
including video tutorials. The UCSC Genome 
Browser has two archived and searchable 
listservs, or electronic mailing lists: one 
for website and data questions, the other 
for queries on setting up and maintaining 
Genome Browser mirrors. JBrowse users can 
ask questions on GitHub or on the software’s
open instant-messaging channel, but Holmes 
suggests contacting the developers directly. 
“We have some developers who really like 
getting feedback from users,” he says. 

And that includes suggestions for handling 
new challenges. Despite their utility, genome 
browsers are still mostly built on a fundamen- 
tal assumption: that genomic data are best 
displayed in a linear format. But that doesn’t 
work so well for some kinds of information, 
including interactions between distant 
genomic regions, and evolutionary relation- 
ships, says Siepel. Some researchers, such as 
Maria Nattestad, a bioinformatician at Google 
in Palo Alto, California, have built niche tools 
for tackling these issues. Nattestad built a tool 
called Ribbon to better visualize long-read
alignments, for instance: these can snarl up
other browsers because they often align with
more than two places in the genome.

“We have all these wonderful new datatypes, 
and we have to figure out how to visualize and
combine them,” says Nattestad. “It keeps me 
up at night in the best way.” 


Anna Nowogrodzki is a journalist in the 
Boston area of Massachusetts. 
I define myself as a ‘grounded mystic’:
that’s how I feel when I am in the desert.
I’m trying to understand the Universe we
are living in and how to search for life in it.
That’s the ‘grounded’ aspect. The ‘mystic’
part is that I’m not afraid of letting my mind
wander in those spaces. 

I am an astrobiologist who specializes in
planetary science, and my team and I study 
analogues of early Martian environments that 
might have been habitable for life as we know 
it. We want to understand the distribution 
and abundance of life in very, very harsh 
conditions — similar to what Mars might have 
been like 3.5 billion years ago — to determine 
how life forms survived under intense 
ultraviolet radiation. And we want to know 
how to detect and identify those life forms. 

I love the barren aspect of the Altiplano in
South America (this picture was taken at Salar 
de Pajonales in Chile, at the southwestern 
edge of this vast Andean plateau). In deserts, 
you have to be face-to-face with yourself — 
there is nothing else. That gives me the space I 
need to get thinking. There are no limitations 
or constraints or boundaries. 

In my most recent research trip to Salar
de Pajonales this autumn, we worked more
on understanding the distribution patterns
of microbial life there. Life’s distribution is 
fractal in nature and repeatable, but you have 
to understand the starting pattern. We are 
learning how to decode where to find extreme 
microbial life. We can then apply these codes 
to future missions to Mars. 

In 2006, we went scuba diving in the
crater lake of the Licancabur volcano, on the
boundary between Chile and Bolivia. I was in
completely transparent waters. The colours 
ranged from pale blue to dark blue, and you 
could see each ray of sunlight diffracted in 
the lake. Suddenly, it was as if the boundaries 
between me and that lake had completely 
disappeared. It was complete peace. 
In science, you have to find reasons: the
whys and the hows, the whats and the whens.
And this moment was without time, without 
space. The lake was not hostile, and there was 
no separation between it and me. 


Nathalie Cabrol is director of the SETI 
Institute’s Carl Sagan Center for the Study 
of Life in the Universe in Mountain View, 
California. Interview by Josie Glausiusz. 


Innovations In


The DNA Drug Revolution


Curing What Ails Us 


DOCTORS HAVE BEEN TREATING THE SYMPTOMS
of most diseases, and not the source, for centuries.
They have cut out tumors, unclogged arteries, injected
insulin and soothed fevers—and have
been unable to touch the biological code within
cells that tells them to grow malignantly, pass
along abnormal nerve signals, take in too much
or too little energy, and swell with inflammation.
The code is the DNA molecule in each
cell that tells it what to do and when, and it
triggers dreaded diseases when it goes wrong.
The molecule, and its messengers, had remained tucked away, beyond the 
reach of almost all drugs, unfixable when broken. But as this special report 
explains, that is no longer the case. 

Things began to change after the DNA sequence for the entire human 
genome was laid out early in this century, and within the past several years 
the ability to synthesize and custom-design shorter sequences has shown 
scientists that the best substance for reaching DNA is, well, DNA. Fabricat- 
ing new genes to replace badly working versions, or to “silence” them, has 
produced 14 approved DNA-related drugs (page S12). And the latest re- 
search indicates that such therapies can be even more effective if scientists 
depart from the basic linear strands and instead make DNA spheres, which 
have enhanced abilities to enter cells (page S3). DNA analysis has also yield- 
ed new targets, showing that although newborn babies in the U.S. are typi- 
cally screened for between 30 and 60 genetic conditions right now, it is pos- 
sible to find nearly 1,000 genes linked to childhood diseases that could be 
new treatment points (page S8).

But that same science has also created troubling issues: some of the gene 
tests for infants can raise false alarms, for instance, and not every child with 
a disease-associated gene ends up getting that disease. Research has also re- 
vealed unfair bias in DNA targets. Most of the data about those sequences 
comes from studies of white people and has missed gene variants that cause 
disease in nonwhites—inequality in research that will produce inequality
in health if it isn’t fixed (page S14). Geneticists are starting projects designed 
to improve this diversity level. DNA in medicine has great power, and that 
power should be used for the many, not the few. 

This report on DNA drugs and related therapies, which is being published 
in Scientific American and Nature, is sponsored by UPMC. It was pro- 
duced independently by Scientific American editors, who have sole respon- 
sibility for all editorial material. UPMC agreed to sponsor this topic but 
had no input into the content. 
DNA or RNA molecules,
arranged into spherical shapes, 
can attack brain cancers and 
other illnesses that evade 
conventional drug design 


By Chad A. Mirkin, Christine Laramy 
and Kacper Skakuj 
BRAIN CANCER IS TERRIFYING. It attacks an organ we see as the core of our personality, our mind,
our very humanity. And because the disease grows inside the brain, it is notoriously difficult to 
treat. The organ has evolved many defenses to keep foreign substances out as a method of self- 
protection, but those substances include many anticancer drugs. Using knives or radiation on 
this citadel of consciousness carries tremendous risks. For these reasons, the five-year relative 
survival rate for people aged 55 to 64 who get glioblastoma, the most common type of primary 
brain tumor, is a grim 5 percent. The disease killed John McCain, Edward Kennedy and Beau 
Biden, and it takes the lives of about 15,000 less famous Americans every year. 


Now we have developed a nano-sized drug that travels through 
the body and into the brain, where it can kill off cancerous cells. 
These drug particles are composed of oligonucleotides—strands of 
DNA or RNA, the molecules that make up the master code that tells 
every cell what to do—and they stick out from a central core like the 
many spines of a sea urchin. The spiny round particles are called 
spherical nucleic acids. In an early trial with eight patients, these 
spheres went into glioblastoma cells and bound up other “code” 
molecules that are key to the cancer’s incessant growth. 

Such spherical drugs appear to work against a variety of diseases. 
Another terrible affliction, this one affecting infants, is spinal mus- 
cular atrophy, or SMA. It robs children of muscle control until swal- 
lowing and breathing become first difficult and ultimately impossi- 
ble. Most youngsters with the disorder succumb before they enter 
kindergarten, and until recently there was no help doctors could of- 
fer. In 2016 the U.S. Food and Drug Administration approved one 
remedy: a drug called Spinraza that is injected directly into the spinal 
cord several times every year and, at a list price of $125,000 per shot, 
is one of the most expensive drugs in the world. We recently com- 
pared our spheres, studded with nucleic acids that get inside cells 
and interfere with messenger molecules that lead to SMA’s symp- 
toms, with the Spinraza approach in studies of rodents. The spheres 
improved survival by four times—115 days versus 28 days—and the 
rate of toxic side effects was much lower. 

Spherical nucleic acids, or SNAs, avoid problems that have 
plagued the pharmaceutical industry's attempts to develop new 
drugs. Conventional drugs are nonspecific: they can affect many 
cells and organs, not just diseased ones; hence, they have numerous 
side effects. Nucleic acids, however, can be designed to interfere with 
only disease-causing genes or their related instruction molecules sent 
to control a cell’s behavior. Biologists have tried to use nucleic acids 
in the past but primarily as linear molecules and with little ability to 
direct where they go. And because the body has robust defenses 
against foreign genetic material—the immune system, for one—in 
most cases, these defenses damaged the drugs immediately or sent 
them to organs such as the liver and kidneys for waste removal. 

But SNAs, at only billionths of a meter across, seem able to travel
anywhere in the body and get inside cells before immune defenses 
can waylay them. The spherical shape lets us pack a high density of 
nucleic acid “spines” into a small space, and that density creates a 
strong interaction with receptors on cell surfaces that admit the particles
inside. There the sequence of the components—the same nucleotides,
abbreviated as A, T, C and G, that constitute the DNA
code of life—ensures that they affect only complementary sequences
of DNA or RNA. (The latter molecule uses U—uracil—instead
of T, and we design for that.) We construct our strands to match 
only sequences in the cells that are crucial to the disease. SNAs are 
not magic bullets and will have to pass many more tests before they 
can be used on lots of patients. But the potential is there: because the 
nucleic components can be reordered to interfere with many differ- 
ent disease-causing molecules within cells, the spheres have the abili- 
ty to tackle some of the world’s most debilitating conditions. 


PROGRAMMABLE DRUGS 

TRADITIONALLY, SCIENTISTS have found disease treatments by screen- 
ing hundreds of thousands of small synthetic or natural molecules, go- 
ing through a long trial-and-error process to see if any of them have 
therapeutic benefits. Although this pipeline has led to a number of 
amazing medicines, such as antibiotics, even the most promising ones 
can cause unwanted side effects. Many other diseases are unaffected by 
these molecules and therefore still lack a cure or treatment. Even bio- 
logics, a newer class of drugs that are often based on proteins made by 
immune cells of mice, rabbits and other animals, typically rely on an 
abbreviated trial-and-error discovery process. 

An ideal drug-design process would allow scientists to rapidly and 
rationally design specific drugs that use the same language as our cells, 
instead of looking for a needle-in-a-haystack molecule. Cells commu- 
nicate many complex messages through DNA and RNA to make mil- 
lions of proteins. The number of steps that cells must execute correct- 
ly to make these proteins is staggering: they must select a specific se- 
quence of DNA made of A, T, C and G nucleotides, transcribe that 
sequence into a form called messenger RNA (mRNA), and then ac- 
curately read that mRNA to arrange molecules called amino acids 
into a chain—as long as 35,000 units—that forms a single protein.
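
The select, transcribe and read steps described above can be sketched in a few lines of Python. This is a toy illustration, not a model of the cell: the DNA sequence is invented, the input is assumed to be the coding strand (so the mRNA is the same sequence with U in place of T), and the codon table is deliberately partial, holding only the five codons the example uses.

```python
# Partial codon table: only the codons needed for this toy sequence.
CODON_TABLE = {
    "AUG": "Met",
    "GCU": "Ala",
    "UGG": "Trp",
    "AAA": "Lys",
    "UAA": "STOP",
}


def transcribe(coding_dna):
    """Coding-strand DNA to mRNA: the same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


dna = "ATGGCTTGGAAATAA"  # invented example sequence
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```

Changing a single letter of `dna` in this sketch either swaps an amino acid, hits a codon missing from the toy table, or introduces an early stop, which is a small-scale picture of how one misplaced nucleotide can derail a protein.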

Errors where one nucleotide such as a T or a G is added, deleted 
or placed in an incorrect order can halt protein production or gener- 
ate an irregular protein that causes disease. Too many copies of an 
mRNA, and therefore of its related protein, can also lead to disease. 
(So can the introduction of foreign nucleic acids from a virus, which 
leads the infected cell to make harmful viral protein.) 

But we can synthesize our own stretches of DNA or RNA com- 
ponents, called oligonucleotides. Because the genetic alphabet has 
very specific rules—A can bind only to T, and C binds only to G— 
we can make our oligonucleotides with sequences that selectively 
bind to and inactivate one disease-driving sequence. When they do 
so, the synthetic oligonucleotides gum up the cellular works, preventing
the affected cells from producing a disease-causing protein.
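
The pairing rules make it possible to write down, for any target, the strand that will bind to it. The Python sketch below takes a short, invented stretch of mRNA and returns the complementary DNA oligonucleotide, reading the target backwards because paired strands run in opposite directions. The function name and the target sequence are hypothetical, and the sketch covers base pairing only; it says nothing about the other considerations that go into a real drug.

```python
# DNA base that pairs with each RNA base (RNA uses U where DNA uses T).
DNA_PARTNER_OF_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_mrna):
    """Return the DNA oligonucleotide (5' to 3') that pairs with target_mrna (5' to 3')."""
    return "".join(DNA_PARTNER_OF_RNA[base] for base in reversed(target_mrna.upper()))


target = "AUGGCUUGG"  # invented stretch of a disease-driving mRNA
oligo = antisense_dna(target)
print(oligo)
```

For the target shown, the function returns `CCAAGCCAT`: each base of the oligonucleotide lines up with its partner on the mRNA, which is what lets a synthetic strand pick out one sequence among the many in a cell.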

Yet despite automated equipment that can rapidly make synthetic 
oligonucleotides with any desired sequence one could imagine, fewer 
than a dozen oligonucleotide-based drugs have been approved for pa- 
tients. This is because these strands of oligonucleotides face a signifi- 
cant hurdle once they are injected into the bloodstream: because they 
are foreign—that is, not native to the patient—they get treated as
hazardous material or waste. The body’s immune system either de- 
stroys these oligonucleotides, or the body’s waste-filtration stations, 
the liver and kidneys, remove them. They do not reach their intended 
target. Even if oligonucleotide strands could make it to a cell that con- 
tained the target mRNA, that cell has an outer membrane that acts as 
a barrier to prevent the oligonucleotides from getting inside. As a re- 
sult, drug companies working with oligonucleotides have often set- 
tled for treating diseases that can be targeted in the liver. The liver is an 
important organ. But sequestering these drugs in this one place really 
limits their use. (An alternative approach—injecting oligonucleotides
directly into the disease site, such as into the
spinal column with Spinraza—is technically difficult
and still does not ensure entry of the medicine
into all the appropriate cells.) 


A SURPRISING RESULT 
ADVANCES IN NANOTECHNOLOGY made by our 
group at Northwestern University, along with sever- 
al other researchers, have led us to the SNAs, which 
may be a way around this problem. Prior to 2006,
our group had been interested in using the highly 
specific binding ability of SNAs in probes for ultrasensitive
diagnostics—to fish out stretches of cancer
DNA from blood samples, for instance. We could do 
this by chemically decorating a gold nanoparticle 
with many strands of DNA designed to anchor one 
end to the particle, producing the sea urchin spine 
pattern. The outer end of the DNA was designed to be a complemen- 
tary sequence to the cancer DNA sequence, so it worked nicely as a 
probe. We also used the spheres as artificial atoms with programmable 
bonds to fashion new types of materials. Drug design, however, was not 
really on our radar. After all, according to the dominant paradigm of 
drug biology and chemistry, RNA and DNA would not naturally cross 
cell membranes. 

We were curious, though, about how nucleic acids in this new ge- 
ometry would interact with living systems. Drug developers had al- 
ready been experimenting with single strands of oligonucleotides, 
with, as we noted, limited success. From our research with SNAs as a 
diagnostic platform, we knew that target DNA and RNA would 
bind to our clusters of spines much more strongly than they would 
attach to free oligonucleotide strands. The reason is that our spines 
are packed densely on the nanoparticle’s surface. That makes them 
more rigid, which helps the As, Ts, Gs and Cs on each strand align 
and bind when they encounter a target strand. This characteristic 
made us suspect that with the right nucleic acid sequences, SNAs 
could be a very potent oligonucleotide drug. 

To test this idea, we carried out an experiment that, at the time,
we thought had only a slim chance of working. We took strands of
free oligonucleotides and put them into a test tube with mouse cells. 
In a different tube we added a bunch of SNAs to the same type of 
mouse cells. We attached red fluorescent molecules to both the 
spheres and the strands to help us track them. When we looked at 
the cells under a microscope, the ones mixed with free strands ap- 
peared transparent, as expected. Free oligonucleotides did not cross 
the cell membrane. But the cells mixed with SNAs lit up the screen 
with bright red fluorescence. The spheres had made it inside! 

How could this happen? In general, cell membranes closely regu- 
late which molecules may enter, and oligonucleotides are not typical- 
ly among the approved guests. Furthermore, oligonucleotides carry a 
negative electrical charge, as do cell surfaces. Like two magnets, the 
two biological objects should repel each other. Yet when we repeated 
this experiment over and over again using more than 50 other human 
and animal cell types, all but one glowed red, a signal of success. 

The ability of SNAs to reach the brain and their
lack of toxicity generate hope for treating
a dangerous cancer, as well as other
neurological disorders, and set the stage
for the next set of clinical trials.

Today we think we know what the gateway is: a type of doorway
molecule called a scavenger receptor that dots the cell surface. These
receptors play a major role when a cell engages with its environment; 
for example, they admit nano-sized biomolecules the cell needs. 
Some of the structural features at the ends of SNA spines happen to 
mimic the natural substrates of these scavenger receptors. As noted 
earlier, the strands on the spheres are densely packed, and like with 
Velcro, the more hooks, the stronger the bond. With free strands, 
even if scavenger receptors recognized them as molecules to take in, 
they have only one hook and float away. 

With the aid of an electron microscope, we could see that once 
an SNA binds to these receptors, the surrounding cell membrane 
folds inward to create a pocket, ushering the SNA into the cell. 


SPHERES AS MEDICINE 
BUT GETTING IN was only half the battle. To work as a drug, the SNA 
needed to find, bind to and inactivate a particular stretch of mRNA 
that instructed the cell to make a disease-associated protein. 
The first stretch of mRNA in a cell that we targeted did not cause
disease but did instruct the cell to make a protein that glowed bright
green under a microscope. Our goal was to stop this mRNA. When
we exposed mouse cells to an SNA designed to match that green-causing
mRNA and compared them with similar cells that did not
get the spheres, the color difference was clear. Sphere-free cells were 
bright green, showing the mRNA had encoded proteins. But cells 
exposed to our SNAs were transparent, meaning we had blocked the 
mRNA before it could pass along instructions to make anything 
green, as we reported in Science in 2006. 

Next we pitted SNAs against the major challenge plaguing linear 
oligonucleotide drugs: destruction by the body’s natural defense sys- 
tem. We found that our spheres have a strong electrical charge—again 
because of the dense packing—that helped them evade immune in- 
terference. This high charge inhibits defense molecules called nucleas- 
es, proteins that degrade foreign DNA and RNA, from getting close. 


REALITY TEST 

WE WERE ON TO SOMETHING, at least in the laboratory. Other scientists
replicated and independently advanced some of our work, including
dermatologist Amy Paller; Arthur Burghes, an expert on spinal muscular
atrophy (SMA); immunotherapy specialist Bin Zhang; cancer biologist
Alex Stegh; transplant surgeon Jason Wertheim; and oncologist Priya
Kumthekar. But
the path from benchtop breakthroughs to healthier patients is long 
and hard, so nearly 10 years ago researchers from our group founded 
a company called Exicure to advance SNA-based drugs to the clinic. 

We initially explored whether these potent drugs could be deliv- 
ered to diseased tissues in skin creams and eye drops, which is feasi- 
ble because SNAs are easily taken up by cells and a big improvement 
over invasive strategies such as direct injections. Two of our first tar- 
gets were psoriasis and poorly healing wounds, and there are several 
promising SNA candidates already in early-stage clinical trials for 
some of these ailments. 

Skin, of course, is relatively easy to get to. The brain is not.
Defended by a vigilant immune system and a web of blood vessels (the
blood-brain barrier) designed to keep foreign molecules out, the
brain makes cancers such as glioblastoma particularly difficult to treat.
We thought, however, that SNAs might move across these defenses 
via the same doorway molecules that ease their path through cell 
membranes. Once in the brain the spheres could home in on cancer 
cells by targeting genes and proteins responsible for keeping the cells 
alive, which malignancies produce in excessive amounts. 

To start this project, we created an SNA drug with many short 
pieces of RNA specifically designed to knock down the production 
of a protein in glioblastoma cells called Bcl2L12. That protein acts as
a biochemical defender that helps to keep the cancer cells function- 
ing. We thought that by intercepting the mRNA that tells the cells to 
make this protein, the SNAs could make the cancer vulnerable to 
conventional medicines. Indeed, in our animal studies, reported in 
2013 in Science Translational Medicine, that is what happened: SNAs 
injected into the bloodstream of mice reached the brain, crossed the 
blood-brain barrier and prevented the production of Bcl2L12 pro- 
tein inside of glioblastoma cells. Last year early clinical results 
showed that these SNAs also reach glioblastoma cells in human pa- 
tients. We did not cure people, and we have yet to test whether the 
SNAs make the cancer cells more vulnerable. Still, the ability of 
SNAs to reach the brain and their lack of toxicity generate hope for 
treating this cancer, as well as other neurological disorders, and set
the stage for the next set of clinical trials. And tests in other diseases, 
such as spinal muscular atrophy, show promise in animals. 

Another exciting direction for SNAs is their use as immunothera- 
pies against cancer. Cancer cells often have proteins in their mem- 
brane that are different from the proteins found in healthy cells. 
Therefore, a cancer cell protein can act as a red flag, and if our immune
system can be trained to go after it the way it goes after a flu virus, our 
own bodies can do a better job of protecting us from the disease. 

To make an SNA cancer vaccine, we exchanged the gold-nanoparti- 
cle core for a hollow nanoparticle called a liposome, filled it with one of 
these red-flag proteins and injected it into animals with the correspond- 
ing cancer. Some of our most recent experiments, published in 2019 in 
the Proceedings of the National Academy of Sciences USA, showed that 
such SNAs elicit an immediate immune response to the tumor, appar- 
ently teaching the immune system to go after cells showing that red 
flag. The effects appear long-lasting, too: the immune system keeps go- 
ing after cells with that protein after the SNAs have vanished. SNAs are 
already showing potency and safety in phase I clinical trials in humans, 
and other spheres targeting a deadly skin cancer are being tested in a 
separate set of safety trials. 

SNAs are, however, not yet approved drugs. There are a number of 
challenges that they have to overcome first. Because the spheres do get 
to a wide set of cells, we need to carefully study whether or not they 
produce any negative “off target” effects even though their design 
should limit them to only problem DNA and RNA. Larger patient 
populations must be explored, and we need to improve targeting to 
increase the amount of drug that gets to the affected organ and cells. 

We think the ability of SNAs to access so many different tissues is 
game-changing and will be central to the emergence and ultimate 
widespread use of such medicines. SNAs are the product of three core 
capabilities: the ability to make large quantities of oligonucleotides, an 
understanding of genetic disease pathways, and the ability to get such 
oligonucleotides into tissues and cells that matter. The first two are
important, but without the third the process is like making software
without the hardware it needs to run on. SNAs may be that crucial and
versatile hardware—a platform able to be reused for many different 
types of illness, one that begins to move the pharma industry away 
from the difficult search for entirely new molecules for every new 
treatment. An SNA simply needs a different set of oligonucleotides to 
be sent after a new disease. And we are just getting started. 


Chad A. Mirkin is director of the International Institute for Nanotechnol- 
DNA to Treat DNA


Within a cell, aberrant DNA—and the messenger RNA 
(mRNA) it uses to tell the cell what to do—can cause 
disease. Scientists can synthesize DNA that specifically 
binds to such problem molecules. When formed into 
spherical nucleic acids (SNAs), it penetrates cells and 
interferes with the trouble-causing molecules. 


LINEAR LIMITS
DNA or RNA drugs have been tried with the more typical linear
strands of the molecules. These can work but often have difficulty
entering a cell or are destroyed by immune defenses. They usually
need to be injected directly into a disease site, which limits use.
Labels: Linear form (oligonucleotide); Overexpressed mRNA.

DISEASE CONTINUES
Molecules of mRNA associated with disease, which instruct the cell
to make proteins, are [illegible]. Traditional small molecule drugs
target the resulting proteins.
Label: Unwanted proteins.

SNAs start with a core, often made from a nanoparticle called
a liposome. Custom-made single-stranded DNA is packed densely
around that core.
Labels: Nanoparticle core; Anchor; Single-stranded DNA.

SPHERICAL SUCCESS
On the surface of the SNAs, the many strands of DNA show
abundant attraction points to cell doorways called scavenger
receptors, in contrast to the single “hook” of a free strand.
Thus, the spheres are more easily taken inside the cell.
Label: Scavenger receptors. (Objects not drawn to scale.)

ILLNESS INTERRUPTED
The custom sequences on SNAs bind only to the mRNAs involved in
disease, ignoring other molecules. Once captured in this way,
the mRNAs can no longer tell the cell to make proteins
that harm the body.
Label: Fewer unwanted proteins.

Illustration by Emily Cooper
MEDICAL TESTS


We now have the ability to screen 
for thousands of genetic diseases 
in newborns. That may not always 
be the healthy thing to do 


By Tanya Lewis 


MITCHELL GORBY CAME INTO this world 
around 3 P.M. on August 9, 2019, at Balboa 
Naval Hospital in San Diego. The baby seemed 
healthy, and his parents, Tiffany and Rylan, 
were thrilled. But a few hours later a nurse 
noticed that Mitchell seemed lethargic and 
never cried, and monitors indicated that his 
body was not getting enough oxygen. Mitchell 
was rushed to the neonatal intensive care unit 
at nearby Rady Children’s Hospital, where tests 
revealed that oxygen wasn't bonding to the 
molecule that carries it through the blood, 
hemoglobin, and his red blood cells were dying 
off. He wasn’t nursing, so the hospital put in
a feeding tube. Mitchell’s doctor ordered CT
and brain scans and tested for infectious dis- 
eases—but she could not figure out what was 
wrong with him. As a last resort, she suggested 
sequencing Mitchell’s genome. 


The results from Stephen Kingsmore’s laboratory at the Rady 
Children’s Institute for Genomic Medicine came back within about 
48 hours. Mitchell had a rare genetic mutation known as hemoglobin 
Toms River, which prevents oxygen from bonding to the proteins in 
fetal red blood cells. The mutation—named after the New Jersey 
hometown of the first patient identified with the problem in 2011— 
affects only fetal hemoglobin; babies start making healthy adult he- 
moglobin within a few months. Doctors just had to keep Mitchell 
alive until that happened. Rady neonatologist Jeanne Carroll says that 
“having his whole genome allowed us to know the starting point” for 
treatment. She and Mitchell’s team of physicians prescribed a series of 
blood transfusions, and the baby improved rapidly. In just under a 
month he was strong enough to go home. 

For children like Mitchell who are born with a genetic disease, it 
used to take years to get a diagnosis, and by then it often was too late. 
Now, however, advances in the speed of genetic sequencing and steep- 
ly falling costs have made it possible to screen for hundreds or even 
thousands of childhood-onset genetic diseases. Within the past year 
or so a few dozen hospitals have started offering the ability to rapidly 
sequence a newborn’s genome to help diagnose a life-threatening con- 
dition soon after birth. Researchers are studying whether such se- 
quencing should be offered to all newborns as part of standard health 
screening. And companies such as Sema4 and BabyGenes are now 
marketing 23andMe-style direct-to-consumer tests to parents simply 
seeking to know more about the health of their baby. Prenatal and 
newborn genetic sequencing is expected to grow to an $11.2-billion 
industry by 2027, up from a $4-billion market in 2018. 
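
To put that forecast in perspective, the two dollar figures quoted above imply a compound annual growth rate that can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `compound_annual_growth_rate` is ours, and it assumes smooth year-over-year growth between 2018 and 2027, which the forecast itself does not state.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the text: $4 billion in 2018, $11.2 billion expected in 2027.
rate = compound_annual_growth_rate(4.0, 11.2, 2027 - 2018)
print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the market would need to grow by roughly 12 percent a year.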

Proponents say that genetic testing of newborns can help diagnose 
a life-threatening childhood-onset disease in urgent cases and could 
dramatically increase the number of genetic conditions all babies are 
screened for at birth, enabling earlier diagnosis and treatment. It could 
also inform parents of conditions they could pass on to future chil- 
dren or of their own risk of adult-onset diseases. Genetic testing could 
detect hundreds or even thousands of diseases, an order of magnitude 
more than current heel-stick blood tests—which all babies born in 
the U.S. undergo at birth—or confirm results from such a test. 

But others caution that genetic tests may do more harm than 
good. They could miss some diseases that heel-stick testing can detect 
and produce false positives for others, causing anxiety and leading to 
unnecessary follow-up testing. Sequencing children’s DNA also raises 
issues of consent and the prospect of genetic discrimination. 

Regardless of these concerns, newborn genetic testing is already 
here, and it is likely to become only more common. But is the tech- 
nology sophisticated enough to be truly useful for most babies? And 
are families—and society—ready for that information? 


IN THE 1960S MICROBIOLOGIST Robert Guthrie developed a test for 
phenylketonuria (PKU), a genetic disorder that causes the amino acid 
phenylalanine to build up in the body. PKU is easily treated with a phe- 
nylalanine-restricted diet, but without intervention it can cause brain 
damage and mental disabilities. Within a few years other U.S. states re- 
quired that Guthrie’s test be administered to newborns, and tests for 
other conditions were soon to follow. By the mid-1980s most states had 
mandatory screening programs. In 2002 the federal government asked 
the American College of Medical Genetics to develop guidelines for
newborn screening, which culminated in the Recommended Universal 
Screening Panel, a set of 35 core conditions and 25 secondary ones that 
are treatable. Most states now test for a subset of these conditions. 

There are roughly 14,000 known genetic diseases in humans, 
ranging from childhood-onset diseases such as PKU and congenital 
heart disease to adult-onset conditions such as Huntington's disease 
and heritable forms of cancer. Some childhood diseases, such as PKU, 
are treatable if caught early. Heel-stick tests look for only a tiny frac- 
tion of these diseases, hence the appeal of genetic testing. 

In the early 2010s researchers at the National Institute of Child 
Health and Human Development and the National Human Ge- 
nome Research Institute launched a program, called NSIGHT (short 
for Newborn Sequencing in Genomic Medicine and Public Health), 
to explore the risks and benefits of DNA screening of newborns. 
Rady’s Kingsmore led one of four projects funded by NSIGHT, 
which explored the use of rapid, whole-genome sequencing in ex- 
tremely sick newborns suspected of having a genetic disease. 

Standard sequencing can take weeks, but using a rapid sequenc- 
ing method and software that compared the genome with the pa- 
tient’s disease characteristics, Kingsmore’s team could get a genetic 
diagnosis back in as little as a day or two. For these babies, hours or 
days can be the difference between life and death or severe disability. 
The first of two trials led by Kingsmore took place from 2014 to
2016 at Children’s Mercy Hospital in Kansas City. The second ran 
from 2017 to 2019 at Rady Children’s. Within the past year the 
group has started offering newborn sequencing at 23 hospitals 
around the country, and lawmakers from California have introduced 
federal legislation to cover the cost of sequencing critically ill babies 
through Medicaid. As of last November, Kingsmore and his col- 
leagues had sequenced more than 1,100 babies with suspected ge- 
netic diseases. About one in three of them received a diagnosis that 
identified an illness, and one in four had their existing treatment 
changed as a result. 

Mitchell Gorby was one of those sequenced at Rady (but not as 
part of NSIGHT). Carroll, the Rady neonatologist, says the informa- 
tion “helped us more confidently give him more transfusions and 
hold off on other testing.” It is possible Mitchell may have survived 
and outgrown his disorder without the test and diagnosis. But in oth- 
er cases, sequencing has very likely saved lives. Moreover, sequencing 
probably significantly reduced the diagnostic odyssey such children 
have to take, Kingsmore says. 


EXTREMELY SICK BABIES are not the only ones who could benefit 
from genetic testing. Another NSIGHT project investigated wheth- 
er sequencing could also be used in clinical settings to screen new- 
borns with no obvious signs of disease. 

For this study, called the BabySeq Project, Robert Green of Brigham
and Women’s Hospital, Alan Beggs of Harvard Medical School
and their colleagues recruited families and randomly assigned half of
them to have their babies’ genomes sequenced. They developed a list
of about 1,500 genes that were highly associated with diseases that
begin in childhood or adolescence, then returned information about a
subset of those genes to the families. The goal was to do the most
comprehensive testing possible, to see anything and everything that
could be discovered about gene-based risks. Last January the group
reported sequencing results from 159 newborns, mostly healthy babies
but also some ill ones in the neonatal ICU. The scientists found that
9.4 percent of the healthy group were at risk of developing a
childhood-onset disease that was not known from their medical or
family history, and 88 percent were carriers for recessive diseases.

Genetic diseases can be spotted by blood tests for newborns
used in many states. The tests look for parts of proteins or other
molecules linked to treatable gene-associated ailments.

Illnesses can now be identified through DNA itself, using one of the
more popular commercial genetic test panels for newborns, Sema4’s
Natalis. Like state blood tests, Natalis screens for diseases that
are treatable.

1,514 genes, each responsible for a different childhood disease, were
identified in a research study on newborns called BabySeq. It looked
for DNA tied to treatable illnesses, for genes that can affect
responses to drugs, and for genes that would not affect the particular
baby but could be passed on and cause disease in future generations.

So was the testing worth it for parents? A 
mother named Natalie, who requested we use only her first name out 
of concern for her family’s privacy, has a son who was enrolled in 
BabySeq. Natalie, who is a physician and lives in Washington, D.C., 
admits she felt some nervousness about the testing. “Whenever you 
have the chance to learn about the health of your child, there’s an op- 
portunity for anxiety,” she says. But overall, she and her husband were 
comfortable with the project. “Because they were looking at only ge- 
netic defects that affect childhood and only illnesses that had some 
preventive measures, we felt it could potentially be useful,” she says. 

Fortunately, the results of tests on her son, Russell, did not turn up 
any childhood-onset genetic disorders. The exams did indicate that he 
may be a carrier for a recessive metabolic disorder called Gaucher dis- 
ease, but the sequencing of this gene is particularly prone to error, so 
he will need follow-up testing to confirm. For other families, the ben- 
efits of sequencing were more clear-cut: one child had a disorder— 
missed by standard screening—that makes the body unable to recycle 
a vitamin called biotin; the condition can cause coma and death if left 
untreated, but it can easily be treated by supplementation. 

Although BabySeq was initially focused only on childhood-onset 
disorders, one baby in the study was found to carry a variant of the 
BRCA2 gene, which is associated with a high risk of breast and other 
cancers, so the researchers asked parents for permission to inform 
them of the risk of adult-onset disorders if they chose. Natalie and her 
husband opted not to receive this information but said they would 
leave it up to Russell if he wanted to be tested when he was older. “We 
felt it should be our son’s decision,” Natalie says. 


BECAUSE OF ITS COMPLEXITY and cost, BabySeq was never intended 
to be a feasible addition to standard newborn screening. “We have not
tried to advocate for this in clinical practice,” Green of Brigham and 
Women's says. But sequencing tests are no longer confined to clinical 
practice. Several companies now offer direct-to-consumer DNA tests 
for newborns. The firm Sema4 sells a test for $379 that it says screens
for more than 190 genetic conditions that can occur before the age of
10 and that can be treated with medication, diet or other interventions.
The company gives results to parents in a genetic-counseling session
about four to six weeks after the test. Sema4’s CEO, Eric Schadt, says 
the test can detect disease-related genetic variants with 99 percent accu- 
racy. Sema4 only reports results for diseases that have a greater than 
80 percent penetrance—the proportion of people with a genetic vari- 
ant who end up developing the disease. It also discloses information 
about the child’s sensitivity to certain drugs, although the U.S. Food and 
Drug Administration has recently been pressuring companies not to
make such information available, because it says that it has not reviewed
the tests and that they may not be backed up by clinical evidence. 

Another company, BabyGenes, offers a test that scours 100 genes 
for more than 72 conditions. It is offered in the form of either a cheek 
swab or dried-blood spot test and retails for $349. 

Schadt admits Sema4 doesn’t know whether the kind of testing it 
offers leads to an overall benefit for patients, although he says the 
company is doing studies to find out. There are reasons to wonder. 
The accuracy of these tests in detecting disease is still uncertain. In a
third NSIGHT project, led by Jennifer Puck, Barbara Koenig and 
Pui-Yan Kwok of the University of California, San Francisco, re- 
searchers sequenced the DNA of dried spots of blood left over from 
newborn heel-stick tests (California has kept all its blood spots since 
the early 1980s). Although the sequencing did detect some genetic 
conditions that the standard newborn screening panel does not test 
for, it missed some of those that standard screening caught. And it 
flagged a lot of genetic variants of unknown significance, Puck says: 
“Newborn screening is very different from having a sick individual in 
front of you for whom you’re trying to arrive at a diagnosis.”

When combined with the standard screening, DNA testing did 
reduce the number of false positives, however. Puck thinks sequenc- 
ing could be an add-on to standard screening when there’s an abnor- 
mal result, but she doesn’t think it should be used to screen all healthy 
babies. “We’re just not at the point where we can interpret the
sequence with sufficient predictive value to say ‘yes’ or ‘no,’ this is a
disease or not,” she says.
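
Why predictive value is such a hurdle when screening healthy babies can be seen with a small Bayes' rule calculation. The numbers below are assumptions chosen for illustration, not figures from any of the studies described here: a test that is 99 percent sensitive and 99 percent specific, applied to a condition that affects 1 in 10,000 newborns. The function name is also ours.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a positive result is a true positive (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs: 99% sensitivity, 99% specificity, 1-in-10,000 prevalence.
ppv = positive_predictive_value(0.99, 0.99, 1 / 10_000)
print(f"Chance a positive result is real: {ppv:.1%}")
```

With these assumed inputs only about 1 in 100 positive results would reflect true disease, because false positives from the large healthy population swamp the rare true cases. In a sick baby whose symptoms already point to a genetic condition, the starting probability is far higher and the same test becomes much more informative.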

Another issue that concerns physicians and medical ethicists is the 
possibility that genetic testing will cause unnecessary anxiety for par- 
ents about diseases that may appear later in life or never show up at 
all. “When it comes to genetic information about your child, a lot of 
people aren't in a position to well interpret what the results mean,” 
says Nita Farahany, a professor of law and philosophy at Duke Uni- 
versity School of Law, who is an expert in genetics and bioethics. “If 
they're told their child has a four times greater risk [of some condi- 
tion], but the population risk is 1 percent, how do they treat their 
children?” There is already a shortage of genetic counselors in the 
U.S., so there would not be enough people to help parents under- 
stand their child’s genetic results. 
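
The arithmetic behind Farahany's example is simple but easy to misread. The sketch below uses only the two numbers in her quote (a fourfold relative risk and a 1 percent population risk); the function name `absolute_risk` is an assumption of this illustration.

```python
def absolute_risk(population_risk: float, relative_risk: float) -> float:
    """Convert a relative risk into an absolute probability, capped at 100%."""
    return min(1.0, population_risk * relative_risk)


# Numbers from the quote: 1% population risk, four times greater risk.
risk = absolute_risk(0.01, 4.0)
print(f"Absolute risk: {risk:.0%}; chance of never developing it: {1 - risk:.0%}")
```

A "four times greater risk" in this case still means the child is far more likely than not to stay free of the condition, which is the kind of context a genetic counselor would normally supply.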

Then there’s the issue of privacy. If the child’s genetic information
is stored on file, who has access to it? If the information becomes pub- 
lic, it could lead to discrimination by employers or insurance compa- 
nies. The Genetic Information Nondiscrimination Act (GINA), 
passed in 2008, prohibits such discrimination. But GINA does not 
apply to employers with fewer than 15 employees and does not cover
insurance for long-term care, life or disability. It also does not apply to 
people employed and insured by the military’s Tricare system, such as 
Rylan Gorby. When his son’s genome was sequenced, researchers also 
obtained permission to sequence Rylan’s genome, to determine if he 
was a carrier for the rare hemoglobin condition. Because it manifests 
itself only in childhood, Gorby decided taking the test was worth the 
risk of possible discrimination. 

Cost is another consideration. Clinical sequencing is still about 
$500 to $800, and interpretation can be upward of $1,000, according 
to Brigham and Women’s Green. For families who can’t afford health 
insurance, this is out of reach. Some experts have also raised concerns 
that genetic testing could lead to a lot of follow-up testing with special- 
ists, which could overburden an already resource-strapped health care 
system. If sequencing turns out to save money in the long run, insur- 
ance companies may cover it, but there’s no guarantee. 

Yet another problem is that the majority of the sequencing to date 
has been done in babies whose families are well-off and white, raising 
concerns that this could become the province of only the privileged. 
And the racial homogeneity could skew the results: diseases more 
prevalent in Caucasian individuals could be overrepresented in test 
panels, whereas illnesses more common in racial minorities may be 
underrepresented. (New medical data projects intend to address this 
disparity [see “All of Us,” on page S14].) 


THE U.C.S.F. NSIGHT PROJECT included a working group that investi- 
gated some of these ethical and policy issues, which culminated in a 
2018 report by the Hastings Center, a bioethics nonprofit in Garrison, 
N.Y. The report concluded that newborn sequencing has many bene- 
fits in helping diagnose sick babies and could expand the number of 
conditions that meet the stringent newborn screening criteria. But us- 
ing genome sequencing as a replacement for newborn screening is “at 
best premature,” the authors say, and direct-to-consumer sequencing 
should not be used for diagnosis or screening purposes. 

Barbara Koenig, a professor of medical anthropology and bioethics
at U.C.S.F. and one of the report’s co-authors, underscores the fact
that sequencing, while promising, is not yet mature enough to be 
routinely used to screen healthy children. “This is not a technology 
that’s ready for prime time for use in healthy infants,” Koenig says. 

Despite these concerns, the era of newborn sequencing is now 
upon us, and the practice will likely become more widespread as costs 
come down and the results become more accurate and useful. In the 
meantime, the risks and benefits of sequencing must be weighed on 
an individual basis. Extremely sick newborns are a completely differ- 
ent case from apparently healthy children of worried parents suscepti- 
ble to marketing from genetic-testing firms. 

For Mitchell Gorby, sequencing was certainly worth it. Two 
months after leaving the hospital, he is doing fine and has doubled his 
weight. His parents are settling into their new routine, somewhat 
sleep-deprived, but happy to be home with their healthy baby boy. 


Tanya Lewis is an associate editor who covers health and medicine 


at Scientific American. 
Gene Therapy Arrives


After false starts, drugs that manipulate 
the code of life are finally changing lives 


By Jim Daley 


The idea for gene therapy, a type of DNA-based medicine that
inserts a healthy gene into cells to replace a mutated,
disease-causing variant, was first published in 1972. After decades
of disputed results, treatment failures and some deaths in
experimental trials, the first gene therapy drug, for a type of skin
cancer, was approved in China in 2003. The rest of the world
was not easily convinced of the benefits, however, and it was not
until 2017 that the U.S. approved one of these medicines. Since
then, the pace of approvals has accelerated quickly. At least nine
gene therapies have been approved for certain kinds of cancer,
some viral infections and a few inherited disorders. A related
drug type interferes with faulty genes by using stretches of DNA
or RNA to hinder their workings. After nearly half a century, the
concept of genetic medicine has become a reality.


GENE INSERTION 


These treatments use a harmless virus to carry a good gene 
into cells, where the virus inserts it into the existing genome, 
canceling the effects of harmful mutations in another gene. 


GENDICINE: China’s regulatory agency approved the world’s first
commercially available gene therapy in 2003 to treat head and neck
squamous cell carcinoma, a form of skin cancer. Gendicine is a virus
engineered to carry a gene that has instructions for making a
tumor-fighting protein. The virus introduces the gene into tumor
cells, causing them to increase the expression of tumor-suppressing
genes and immune response factors. The drug is still awaiting
FDA approval.

GLYBERA: The first gene therapy to be approved in the European Union
treated lipoprotein lipase deficiency (LPLD), a rare inherited
disorder that can cause severe pancreatitis. The drug inserted the
gene for lipoprotein lipase into muscle cells. But because LPLD
occurs in so few patients, the drug was unprofitable. By 2017 its
manufacturer declined to renew its marketing authorization; Glybera
is no longer on the market.

IMLYGIC: The drug was approved in China, the U.S. and the E.U. to
treat melanoma in patients who have recurring skin lesions following
initial surgery. Imlygic is a modified genetic therapy inserted
directly into tumors with a viral vector, where the gene replicates
and produces a protein that stimulates an immune response to kill
cancer cells.

KYMRIAH: Developed for patients with B cell lymphoblastic leukemia,
a type of cancer that affects white blood cells in children and
young adults, Kymriah was approved by the FDA in 2017 and the E.U.
in 2018. It works by introducing a new gene into a patient’s own
T cells that enables them to find and kill cancer cells.

Barbara Koenig, a professor of medical anthropology and bioeth- 
ics at U.C.S.F. and one of the report’s co-authors, underscores the fact 
that sequencing, while promising, is not yet mature enough to be 
routinely used to screen healthy children. “This is not a technology 
that’s ready for prime time for use in healthy infants,” Koenig says. 

Despite these concerns, the era of newborn sequencing is now 
upon us, and the practice will likely become more widespread as costs 
come down and the results become more accurate and useful. In the 
meantime, the risks and benefits of sequencing must be weighed on 
an individual basis. Extremely sick newborns are a completely differ- 
ent case from apparently healthy children of worried parents suscepti- 
ble to marketing from genetic-testing firms. 

For Mitchell Gorby, sequencing was certainly worth it. Two 
months after leaving the hospital, he is doing fine and has doubled his 
weight. His parents are settling into their new routine, somewhat 
sleep-deprived, but happy to be home with their healthy baby boy. 


Tanya Lewis is an associate editor who covers health and medicine 
at Scientific American. 


of disputed results, treatment failures and some deaths in 
experimental trials, the first gene therapy drug, for a type of skin 
cancer, was approved in China in 2003. The rest of the world 
was not easily convinced of the benefits, however, and it was not 
until 2017 that the U.S. approved one of these medicines. Since 
then, the pace of approvals has accelerated quickly. At least nine 
gene therapies have been approved for certain kinds of cancer, 
some viral infections and a few inherited disorders. A related 
drug type interferes with faulty genes by using stretches of DNA 
or RNA to hinder their workings. After nearly half a century, the 
concept of genetic medicine has become a reality. 

These treatments use a harmless virus to carry a good gene 
into cells, where the virus inserts it into the existing genome, 
canceling the effects of harmful mutations in another gene. 


LUXTURNA: The drug was approved by the FDA in 2017 and in the E.U.
in 2018 to treat patients with a rare form of inherited blindness
called biallelic RPE65 mutation-associated retinal dystrophy. The
disease affects between 1,000 and 2,000 patients in the U.S. who
have a mutation in both copies of a particular gene, RPE65. Luxturna
delivers a normal copy of RPE65 to patients’ retinal cells, allowing
them to make a protein necessary for converting light to electrical
signals and restoring their vision.

STRIMVELIS: About 15 patients are diagnosed in Europe every year
with severe immunodeficiency from a rare inherited condition called
adenosine deaminase deficiency (ADA-SCID). These patients’ bodies
cannot make the ADA enzyme, which is vital for healthy white blood
cells. Strimvelis, approved in the E.U. in 2016, works by introducing
the gene responsible for producing ADA into stem cells taken from
the patient’s own marrow. The cells are then reintroduced into the
patient’s bloodstream, where they are transported to the bone marrow
and begin producing normal white blood cells that can produce ADA.

YESCARTA: Developed to treat a cancer called large B cell lymphoma,
Yescarta was approved by the FDA in 2017 and in the E.U. in 2018.
It is in clinical trials in China. Large B cell lymphoma affects
white blood cells called lymphocytes. The treatment, part of an
approach known as CAR-T cell therapy, uses a virus to insert a gene
that codes for proteins called chimeric antigen receptors (CARs)
into a patient’s T cells. When these cells are reintroduced into the
patient’s body, the CARs allow them to attach to and kill cancer
cells in the bloodstream.

ZOLGENSMA: In May 2019 the FDA approved Zolgensma for children
younger than two years with spinal muscular atrophy, a neuromuscular
disorder that affects about one in 10,000 people worldwide. It is
one of the leading genetic causes of infant mortality. Zolgensma
delivers a healthy copy of the human SMN gene to a patient’s motor
neurons in a single treatment.

ZYNTEGLO: Granted approval in the E.U. in May 2019, Zynteglo treats
a blood disorder called beta thalassemia that reduces a patient’s
ability to produce hemoglobin, the protein in red blood cells that
contains iron, leading to life-threatening anemia. The therapy has
been approved for individuals 12 years and older who require regular
blood transfusions. It employs a virus to introduce healthy copies
of the gene for making hemoglobin into stem cells taken from the
patient. The cells are then reintroduced into the bloodstream and
transported to the bone marrow, where they begin producing healthy
red blood cells that can manufacture hemoglobin.


GENE INTERFERENCE 

This approach uses a synthetic strand of RNA or DNA 
(called an oligonucleotide) that, when introduced into 
a patient’s cell, can attach to a specific gene or its 
messenger molecules, effectively inactivating them. 
Some treatments use an antisense method, named for 
one DNA strand, and others rely on small interfering 
RNA strands, which stop instruction molecules that go 
from the gene to the cell's protein factories. 


DEFITELIO: This drug contains a mixture of single-strand
oligonucleotides obtained from the intestinal mucosa of pigs. It was
approved (with limitations) in the U.S. and the E.U. in 2017 to treat
severe cases of veno-occlusive disease, a disorder in which the small
veins of the liver become obstructed, in patients who have received
a bone marrow transplant.

EXONDYS 51: In 2016 the FDA granted approval to Exondys 51 amid some
controversy regarding its efficacy; two members of the FDA review
panel resigned in protest of the decision. The therapy is designed
to treat a form of Duchenne muscular dystrophy caused by mutations
in the RNA that codes for the protein that helps to connect muscle
fibers’ cytoskeletons to a surrounding matrix. Exondys 51 is
effective in treating about 13 percent of the Duchenne population.

KYNAMRO: Approved by the FDA in 2013, Kynamro is designed to inhibit,
or effectively shut down production of, a protein that helps to
produce low-density lipoprotein (LDL). Injected subcutaneously, this
therapy is used to lower LDL levels in patients who have dangerously
high cholesterol.

MACUGEN: Age-related macular degeneration is the leading cause of
vision loss in people age 60 and older. It is caused by deterioration
of the center of the retina due to leaking blood vessels. Approved
in the U.S., Macugen inhibits these blood vessels from growing under
the retina, thus treating the disorder.

SPINRAZA: With its FDA approval in 2016, Spinraza became the first
gene-based therapy for spinal muscular atrophy. The inherited
disorder is caused by low levels of SMN, a key protein for the
maintenance of motor neurons. Spinraza binds to RNA from a “backup”
gene called SMN2, converting that RNA into instructions for making
fully functioning SMN proteins.

Jim Daley is a freelance journalist based in Chicago.


BIG DATA 


DNA-based medicine needs more 
diversity to avoid harmful bias. One 
big research project is fixing that 


By Stephanie Devaney 


WHEN THE RACE TO sequence the first 
human genome was rushing toward the 
finish line about 20 years ago, I remember 
feeling mesmerized by what was about 
to happen. It was the dawn of a new cen- 
tury, and it seemed we were on the cusp 
of unlocking the meaning behind the 
blueprint of life, DNA. Once we could 
line up all 3.1 billion base pairs of the 
molecule in our genome, I thought— 

I was an undergraduate student at the 
time, dazzled by science—we would 
understand everything there is to know 
about human health and disease. 


What I didn’t know was that those first decades of genetic 
medicine would leave a lot of people behind. So I was taken 
aback several years later, in 2009, just after I got my doctorate 
in molecular genetics, when researchers at Duke University 
reported that 96 percent of the genomic data we had gath- 
ered came from people of European ancestry. This was not 
the result of small numbers: they calculated the percentage 
using the more than 1.7 million individual genome samples 
analyzed at the time, but the samples were lacking diversity. 
Over the next few years things did not get much better, and 
as recently as four years ago genomic databases were still way 
out of balance, with more representation of Europeans and 
less of everyone else. 

This inequity, if it is not fixed, will turn into tremendous 
health inequality. Today more and more people are getting 
answers about the underlying causes of their diseases be- 
cause of medicine’s ability to mine their genomes. There are 
hundreds of drugs that contain genetic information in their 
labeling because gene variants affect how bodies process 
these drugs, and knowing the variants that patients have 
helps doctors set the most beneficial dose for their patients. 
Moreover, today improved knowledge about the genomic 
drivers of different cancers has paid dividends in how physi- 
cians diagnose and treat many tumors. Yet people who are 
not white and not male have different sets of genes that do 
not always fit into these treatment regimens. 

For example, African-Americans and Latinos have the 
highest rate of asthma in the U.S., but studies show that 
common drugs used in inhalers do not help them as well as 
they help whites. Asians who take the antiseizure drug car- 
bamazepine have a higher risk of a severe, sometimes fatal, 
reaction. Nobody developing these drugs, or prescribing 
them when they first came into use, anticipated these prob- 
lems. If DNA is one important factor in our quest for more 
effective medical treatment, we need to address the lack of 
diversity in genetic data. 

That is where the All of Us Research Program, where I 
work, hopes to help. Set up by the National Institutes of 
Health and launched in 2018, we are asking a million or 
more people from all backgrounds to join us as partners in 
research, not as human subjects, and share all kinds of health 
information over the course of their lives. Already we have 
more than 250,000 participants. More than 51 percent be- 
long to racial and ethnic minorities, more than 10 percent 
are sexual and gender minorities, and overall more than 
80 percent represent a group that has been historically un- 
derrepresented in research data sets. 

People can join All of Us by going to our program Web 
site (www.joinallofus.org) and clicking “Join Now.” After 
agreeing to participate, respondents can offer us their medical 
records, answer a variety of surveys about their health 
and lifestyle, and participate in other activities such as syncing 
their fitness tracker data to our program. We also have 
hundreds of enrollment sites at local hospitals and health 
centers across the country where participants can provide 
samples of blood and urine to help researchers study their 
DNA. Our hope is for people to stick with us for 10 years or 
more because, as the program grows, we will regularly add 
new ways for them to learn about themselves and contribute 
to research. 


Biased Gene Studies 

To link genes to disease risk and other traits, hundreds of genome-wide 
association studies (GWAS) have looked at the DNA of thousands of 
different people as of 2018. But in terms of racial background, these 
people are not so different. Taking all the projects together, 78 percent 
of the people in them are white Europeans, whereas just 2 percent are 
African and 1 percent are Hispanic or Latin American. The studies 
themselves also predominantly focused on Europeans and rarely on other 
populations. So gene variants that appear in non-European people and 
may be linked to illness rarely show up in this research. The scarcity 
makes it hard to analyze and understand the significance of the variants. 

Racial Backgrounds in Published Gene-Association Studies 


THE MOMENT IS RIGHT 

A LOT OF THIS PARTICIPANT-RESEARCHER collaboration is 
linked to advances in technology. Sequencing that first human 
genome had a $1-billion price tag. Today such a sequence costs 
less than $1,000 and can take less than 24 hours to complete. 
It is also easier to integrate this information with other crucial 
medical data. Health care organizations have been turning 
their patients’ paper-based medical records into electronic 
versions. As of 2017, 96 percent of all U.S. hospitals and 80 percent 
of all office-based doctors are using a certified electronic 
health record system. New apps on smartphones and other 
digital health technologies such as smart watches collect 
data from nearly anywhere and directly from a person. These 
trends all make it easier to store, share and mine large data 
sets for answers to questions about disease causes and effects. 
Such trends also raise big and disturbing issues about priva- 
cy, making it important for projects such as ours to have both 
strong security and full transparency to all our participants. 

And it is crucial to treat these people as partners. The ac- 
tions of past medical researchers have earned much distrust 
in minority communities, after causing harm in the Tuske- 
gee Syphilis Study, where researchers misled African-Amer- 
ican men with syphilis and never gave them adequate treat- 
ment, and with the widespread use of HeLa cells, which 
were taken from a patient named Henrietta Lacks without 
her knowledge or permission. People wanted to see research 
go forward but with them rather than about them. To over- 
come this kind of distrust, All of Us is using a new model 
for research, one that invites input from participants as well 
as researchers with science degrees. Participants serve on the 
program’s advisory and governing bodies, working groups, 
and task forces. We have also partnered with local health 
care organizations, hospitals, and community groups to ad- 
vise us and help find people to participate. Community en- 
gagement is not familiar ground for large medical research 
projects, and we are still learning the best ways to do it. 

Some studies have provided us with blueprints for devel- 
oping long-term relationships like the ones we hope to 
have, studies that have changed medicine for the better. The 
Framingham Heart Study, for example, started in 1948 with 
5,209 men and women, largely white, from one town in 
Massachusetts. With a 99 percent retention rate, the study 
continues to this day. As participants share data year after 
year, researchers can see how their heart health changes over 
time. The risk factors for heart disease identified by the 
Framingham study—such as high blood pressure, high 
cholesterol, smoking and obesity—are so ingrained in our 
collective consciousness and our approach to health care 
that they feel like common sense. 


GOING FURTHER 

THIS KIND OF MEDICAL DISCOVERY is what we envision for 
All of Us, but we want to take it further, with participants 
who are not all white and who represent diversity in many 
dimensions, not just traditional race labels that, in reality, en- 
compass a lot of different backgrounds. If we're going to get 
at the root causes of health and disease, this means under- 
standing the differences and similarities among us all. For ex- 
ample, sickle cell disease occurs when someone inherits two 
mutated genes for the oxygen-carrying protein hemoglobin. 
It affects 100,000 African-Americans and more than 20 mil- 
lion people around the world. In contrast, sickle cell trait— 
meaning just one of these genes is mutated—actually gives 
people an advantage in surviving malaria, which makes evo- 
lutionary sense if your ancestors came from areas such as Af- 
rica where malaria is prevalent. New studies, however, have 
found that sickle cell trait might not be as benign as doctors 
used to believe, because it may increase the risk for kidney 
disease. Some African-Americans are more susceptible to this 
risk and some less. There’s clearly more to learn about why 
this might be the case and about how different DNA vari- 
ants might interact to affect the health of people with sickle 
cell trait. The DNA information from more than a million 
All of Us participants could help researchers learn much 
more about complex traits like this. 

We do have to start with some of the broad-brush catego- 
ries to recruit enough people to start recognizing the more 
fine-grained groups among them. Currently we are exceeding 
our goal of overrepresenting groups that have been historical- 
ly underrepresented in research. For instance, African-Ameri- 
cans make up about 13 percent of the U.S. population but 
just 3 percent of the samples previously used in genome stud- 
ies. In All of Us, 21.5 percent of participants so far are Afri- 
can-American. Similarly, Hispanics constitute about 18 per- 
cent of the U.S. population but in 2016 made up less than 
1 percent of the data in our genomic databases. Today 
17.6 percent of All of Us participants are Hispanic. 

That diversity will help us discover more about how 
DNA affects health across different communities, but the 
molecule will not be our sole focus. Many factors beyond 
our genes are at play when it comes to disease. We know 
that where you were born, what you eat, the stress you feel, 
and other clinical and biological factors affect health, but 
we still don’t understand by how much. For example, when 
we think about some of the most common chronic diseases 
that afflict our population (high blood pressure is one 
example), many of them disproportionately affect the most 
socially and economically disadvantaged people in our 
country. And from what we can tell at the moment, the 
determinants are not simply their race or ethnicity. Risks also 
include family structure, socioeconomic status, stressors 
such as trauma, sex and gender inequality, availability of 
nutrient-rich foods, access to health care, and many other 
factors that we can capture in the All of Us data set. 


A Better Balance 

A new precision medicine project, All of Us, has much larger populations 
of groups that have been historically underrepresented in genetics 
research. The project, sponsored by the U.S. National Institutes of Health, 
began recruiting participants in 2018. More than 250,000 people 
enrolled by October 2019, and just over 20 percent are black, 
African-American or African. About 18 percent are Latino, Hispanic or Spanish. 
Nearly 3 percent are Asian, and 6.7 percent are of mixed races. Slightly 
less than half of the people are white. The project’s goal is to get DNA 
and other health information from more than one million people. 

Within the next several years, we should be able to com- 
pare this rich set of information with participants’ DNA. 
When we do so, scientists such as myself, the All of Us par- 
ticipants and all of you will start to get a clearer picture of the 
roles that biology and environment play in disease develop- 
ment, and—most important of all—what we can do about it. 

Molecular geneticist Stephanie Devaney is deputy 
director of the All of Us research program at the National 
Institutes of Health. She was the staff lead for the White 
House Precision Medicine Initiative. 